Abstract

In recent work, Özgür, Lévêque, and Tse (2007) obtained a complete characterization of throughput
scaling for random extended wireless networks (i.e., $n$ nodes are placed uniformly at random in a square region of
area $n$). They showed that for small path-loss exponents $\alpha \in (2, 3]$ cooperative communication is order optimal,
and for large path-loss exponents $\alpha > 3$ multi-hop communication is order optimal. However, their results (both
the communication scheme and the proof technique) are strongly dependent on the regularity induced with high
probability by the random node placement.

In this paper, we consider the problem of characterizing the throughput scaling in extended wireless networks
with arbitrary node placement. As a main result, we propose a more general novel cooperative communication
scheme that works for arbitrarily placed nodes. For small path-loss exponents $\alpha \in (2, 3]$, we show that our scheme
is order optimal for all node placements, and achieves exactly the same throughput scaling as in Özgür et al. This
shows that the regularity of the node placement does not affect the scaling of the achievable rates for $\alpha \in (2, 3]$.
The situation is, however, markedly different for large path-loss exponents $\alpha > 3$. We show that in this regime the
scaling of the achievable per-node rates depends crucially on the regularity of the node placement. We then present a
family of schemes that smoothly "interpolate" between multi-hop and cooperative communication, depending upon
the level of regularity in the node placement. We establish order optimality of these schemes under adversarial
node placement for $\alpha > 3$.

Index Terms 

Arbitrary node placement, capacity scaling, cooperative communication, hierarchical relaying, multi-hop
communication, wireless networks.

I. Introduction

Consider a wireless network with $n$ nodes placed on $[0, \sqrt{n}]^2$ (usually referred to as an extended
network), with each node being the source for one of $n$ source-destination pairs and the destination for
another pair. The performance of this network is captured by $\rho^*(n)$, the largest uniformly achievable rate
of communication between these source-destination pairs. While the scaling behavior of $\rho^*(n)$ as the
number of nodes $n$ goes to infinity is by now well understood for random node placement, little is known
for the case of arbitrary node placements. In this paper, we are interested in analyzing the impact of such
arbitrary node placement on the scaling of $\rho^*(n)$.

A. Related Work 

The problem of determining the scaling of $\rho^*(n)$ was first analyzed by Gupta and Kumar in [1]. They
show that, under random placement of nodes in the region, certain models of communication motivated
by current technology, and random source-destination pairing, the maximum achievable per-node rate
$\rho^*(n)$ can scale at most as $O(n^{-1/2})$. Moreover, it was shown that multi-hop communication can achieve
essentially the same order of scaling.

Since [1], the problem has received a considerable amount of attention. One stream of work [2]-[8] has
progressively broadened the conditions on the channel model and the communication model under which
multi-hop communication is order optimal. Specifically, with a power loss of $r^{-\alpha}$ for signals sent over
distance $r$, it has been established that under high signal attenuation $\alpha > 3$ and random node placement, the
best achievable per-node rate $\rho^*(n)$ for random source-destination pairing scales essentially like $\Theta(n^{-1/2})$
and that this scaling is achievable with multi-hop communication.

Another stream of work [8]-[12] has proposed progressively refined multi-user cooperative schemes,
which have been shown to significantly outperform multi-hop communication in certain environments. In
an exciting recent work, Özgür et al. [8] have shown that with nodes placed uniformly at random, and with
low signal attenuation $\alpha \in (2,3]$, a cooperative communication scheme can perform significantly better
than multi-hop communication. More precisely, they show that for $\alpha \in (2,3]$, the best achievable per-node
rate for random source-destination pairing scales as $\rho^*(n) = O(n^{1-\alpha/2+\varepsilon})$ and cooperative communication
achieves a per-node rate of $\Omega(n^{1-\alpha/2-\varepsilon})$ (here, $\varepsilon > 0$ is an arbitrary but fixed constant). That is, cooperative
communication is essentially order optimal in the attenuation regime $\alpha \in (2,3]$.

In summary, for random extended networks with random source-destination pairing, the optimal
communication scheme exhibits the following threshold behavior: for $\alpha \in (2,3]$ the cooperative communication
scheme is order optimal, while for $\alpha > 3$ the multi-hop communication scheme is order optimal.

B. Our Contributions 

The characterization of the scaling of $\rho^*(n)$ as a function of the path-loss exponent $\alpha$ mentioned in
the last paragraph depends critically on the regularity induced with high probability by placing the nodes
uniformly at random. However, a wireless network encountered in practice might not exhibit this amount
of regularity. Our interest is therefore in understanding the impact of the node placement on the scaling
of $\rho^*(n)$. To this end, we consider wireless networks with arbitrary (i.e., deterministic) node placement
(with minimum-separation constraint).

The impact of this arbitrary node placement depends crucially on the path-loss exponent $\alpha$. For small
path-loss exponents $\alpha \in (2,3]$, we show that for random source-destination pairing, the rate of the
best communication scheme is upper bounded as $\rho^*(n) = O(\log^6(n) n^{1-\alpha/2})$. We then present a novel
cooperative communication scheme that achieves for any path-loss exponent $\alpha > 2$ a per-node rate of
$\rho^{\mathrm{HR}}(n) \geq n^{1-\alpha/2-o(1)}$. Thus, our cooperative communication scheme is essentially order optimal for any
such arbitrary network with $\alpha \in (2, 3]$. In other words, in the small path-loss regime, the scaling of $\rho^*(n)$
is the same irrespective of the regularity of the node placement.

The situation is, however, quite different for large path-loss exponents $\alpha > 3$. We show that in this
regime the scaling of $\rho^*(n)$ depends crucially on the regularity of the node placement, and multi-hop
communication may not be order optimal for any value of $\alpha$. In fact, for less regular networks we
need more complicated cooperative communication schemes to achieve optimal network performance.
Towards that end, we present a family of communication schemes that smoothly "interpolate" between
cooperative communication and multi-hop communication, and in which nodes communicate at scales that
vary smoothly from local to global. The amount of "interpolation" between the cooperative and multi-hop
schemes depends on the level of regularity of the underlying node placement. We establish the optimality
of this family of schemes for all $\alpha > 3$ under adversarial node placement.

In summary, for $\alpha \in (2, 3]$ the regularity of the node placement has no impact on the scaling of $\rho^*(n)$.
Cooperative communication is order optimal in this regime and achieves the same scaling as in the case
of random node placement. For $\alpha > 3$ the regularity of the node placement strongly impacts the scaling of
$\rho^*(n)$, and a communication scheme "interpolating" between multi-hop and cooperative communication
depending on the regularity of the node placement is order optimal (under adversarial node placement).
In particular, simple multi-hop communication may not be order optimal for any $\alpha > 3$. This contrasts
with the case of random node placement where multi-hop communication is order optimal for all $\alpha > 3$.

C. Organization 

The remainder of this paper is organized as follows. Section II describes in detail the communication
model. Section III provides formal statements of our results. Sections IV and V describe our new
cooperative communication scheme (for the $\alpha \in (2, 3]$ regime) and "interpolation" scheme (for the $\alpha > 3$
regime) for arbitrary wireless networks. Sections VI through XI contain proofs. Finally, Sections XII
and XIII contain discussions and concluding remarks.

II. Model 

In this section, we introduce some notational conventions and describe in detail the network and channel 
models. 

We use the following conventions: $K_i$ for different $i$ denote strictly positive finite constants independent
of $n$. Vectors and matrices are denoted by boldface whenever the vector or matrix structure is of importance.
We denote by $(\cdot)^{T}$ and $(\cdot)^{\dagger}$ transpose and conjugate transpose, respectively. To simplify notation, we
assume, when necessary, that fractions are integers and omit $\lceil \cdot \rceil$ and $\lfloor \cdot \rfloor$ operators.

Consider the square

\[ A(n) \triangleq [0, \sqrt{n}]^2 \]

of area $n$, and let $V(n) \subset A(n)$ be a set of $|V(n)| = n$ nodes on¹ $A(n)$. We say that $V(n)$ has minimum
separation $r_{\min}$ if $r_{u,v} \geq r_{\min}$ for all distinct $u, v \in V(n)$, where $r_{u,v}$ is the Euclidean distance between nodes $u$
and $v$. We use the same channel model as in [8]. Namely, the (sampled) received signal at node $v$ is

\[ y_v[t] = \sum_{u \in V(n) \setminus \{v\}} h_{u,v}[t] \, x_u[t] + z_v[t] \tag{1} \]

for all $v \in V(n)$, where $\{x_u[t]\}_{u,t}$ are the (sampled) signals sent by the nodes in $V(n)$. Here $\{z_v[t]\}_{v,t}$
are independent and identically distributed (i.i.d.) with distribution $\mathcal{N}_{\mathbb{C}}(0, 1)$ (i.e., circularly symmetric
complex Gaussian with mean 0 and variance 1), and

\[ h_{u,v}[t] \triangleq r_{u,v}^{-\alpha/2} \exp\bigl(\sqrt{-1}\, \theta_{u,v}[t]\bigr) \]

for path-loss exponent $\alpha > 2$. We assume that for each $t \in \mathbb{N}$, the phases $\{\theta_{u,v}[t]\}_{u,v}$ are i.i.d.² with
uniform distribution on $[0, 2\pi)$. We either assume that for each $u, v \in V(n)$ the random process $\{\theta_{u,v}[t]\}_t$
is stationary ergodic in $t$, which is called fast fading in the following, or that for each $u, v \in V(n)$ the
random process $\{\theta_{u,v}[t]\}_t$ is constant in $t$, which is called slow fading in the following. In either case, we
assume full channel state information (CSI) is available at all nodes, i.e., each node knows all $\{\theta_{u,v}[t]\}_{u,v}$
at time $t$. While the full CSI assumption is quite strong, it can be shown that availability of a 2-bit
quantized version of $\{\theta_{u,v}[t]\}_{u,v}$ at all nodes is sufficient for the achievable schemes presented here (see
Section XII-A for the details). We also impose an average power constraint of 1 on the signal $\{x_u[t]\}_t$
for every node $u \in V(n)$.
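As a rough illustration of this channel model, the following sketch computes one received sample at a node. It assumes the channel gain has amplitude $r_{u,v}^{-\alpha/2}$ (i.e., a power loss of $r^{-\alpha}$) and a uniformly distributed phase; the function names `min_separation`, `channel_gain`, and `received_signal` are hypothetical and chosen only for this example.

```python
import cmath
import math
import random


def min_separation(nodes):
    """Smallest Euclidean distance between any two distinct nodes."""
    return min(
        math.dist(nodes[i], nodes[j])
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
    )


def channel_gain(u, v, alpha, theta):
    """Assumed gain r^(-alpha/2) * exp(sqrt(-1) * theta) between positions u and v."""
    r = math.dist(u, v)
    return r ** (-alpha / 2) * cmath.exp(1j * theta)


def received_signal(v_idx, nodes, x, alpha, rng):
    """One noisy sample at node v_idx for transmitted symbols x (one per node)."""
    total = 0j
    for u_idx, u in enumerate(nodes):
        if u_idx == v_idx:
            continue  # the sum excludes the receiving node itself
        theta = rng.uniform(0.0, 2.0 * math.pi)  # i.i.d. uniform phase
        total += channel_gain(u, nodes[v_idx], alpha, theta) * x[u_idx]
    # Circularly symmetric complex Gaussian noise with variance 1:
    # real and imaginary parts each have variance 1/2.
    sigma = math.sqrt(0.5)
    noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return total + noise
```

Drawing a fresh phase on every call corresponds loosely to the fast fading case; for slow fading one would draw the phases once and reuse them for all $t$.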

Each node $u \in V(n)$ wants to transmit information at uniform rate $\rho(n)$ to some other node $w \in V(n)$.
We call $u$ the source and $w$ the destination node of this communication pair. The set of all communication
pairs can be described by a traffic matrix $\lambda(n) \in \{0, 1\}^{n \times n}$, where the entry in $\lambda(n)$ corresponding to
$(u, w)$ is equal to 1 if node $u$ is a source for node $w$. We say that $\lambda(n)$ is a permutation traffic matrix if it is
a permutation matrix (i.e., every node is a source for exactly one communication pair and a destination for
exactly one communication pair). For a traffic matrix $\lambda(n)$, let $\rho^*(n)$ be the highest rate of communication
that is uniformly achievable for each source-destination pair. For a permutation traffic matrix $\lambda(n)$, $\rho^*(n)$
can also be understood as the maximal achievable per-node rate.
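The definition of a permutation traffic matrix can be made concrete with a small sketch. The helper names `random_permutation_traffic` and `is_permutation_traffic` are hypothetical; the matrix is stored as a list of rows, with row index the source and column index the destination.

```python
import random


def random_permutation_traffic(n, rng):
    """Return an n x n 0/1 matrix with exactly one 1 per row and per column.

    The permutation is drawn uniformly at random, so it may contain fixed
    points (a node paired with itself); this sketch does not exclude them.
    """
    dest = list(range(n))
    rng.shuffle(dest)
    return [[1 if dest[u] == w else 0 for w in range(n)] for u in range(n)]


def is_permutation_traffic(lam):
    """Check that every node is a source exactly once and a destination exactly once."""
    n = len(lam)
    entries_ok = all(e in (0, 1) for row in lam for e in row)
    rows_ok = all(len(row) == n and sum(row) == 1 for row in lam)
    cols_ok = rows_ok and all(sum(lam[u][w] for u in range(n)) == 1 for w in range(n))
    return entries_ok and rows_ok and cols_ok
```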

¹The setting considered here with $n$ nodes placed on a square of area $n$ is called an extended network. If the $n$ nodes are placed on a
square of unit area, we speak of a dense network. While dense networks are not treated in detail in this paper, we briefly discuss implications
of the results for the dense setting in Section XII-C.

²It is worth pointing out that recent work [13] suggests that, under certain assumptions on scattering elements, for $\alpha \in (2, 3)$, and for very
large values of $n$, the i.i.d. phase assumption as a function of $u, v \in V(n)$ used here is too optimistic. However, subsequent work by the
same authors [14] shows that under different assumptions on the scatterers, the channel model used here is still valid even for $\alpha \in (2,3)$,
and for very large values of $n$. This indicates that the question of channel modeling for very large networks in the low path-loss regime is
somewhat delicate and requires further investigation. We point out that for $\alpha > 3$ this issue does not arise.



III. Main Results 

This section presents the formal statement of our results. The results are divided into two parts. In Section
III-A, we consider low path-loss exponents, i.e., $\alpha \in (2,3]$. We present a cooperative communication
scheme for arbitrary node placement and for either fast or slow fading. We show that this communication
scheme is order optimal for all node placements when $\alpha \in (2,3]$. In Section III-B, we consider high
path-loss exponents, i.e., $\alpha > 3$. We present a communication scheme that "interpolates" between the
cooperative and the multi-hop communication schemes, depending on the regularity of the node placement.
We show that this communication scheme is order optimal under adversarial node placement with regularity
constraint when $\alpha > 3$.

A. Low Path Loss Regime $\alpha \in (2, 3]$

The first result proposes a novel communication scheme, called hierarchical relaying in the following,
and bounds the per-node rate $\rho^{\mathrm{HR}}(n)$ that it achieves. This provides a lower bound to $\rho^*(n)$, the largest
achievable per-node rate. The hierarchical relaying scheme enables cooperative communication on the
scale of the network size. In the random node placement case, this cooperation could be enabled in a
cluster around the source node (cooperatively transmitting) and in a cluster around its destination node
(cooperatively receiving). With arbitrary node placement, such an approach no longer works, as both
the source as well as the destination nodes may be isolated. The hierarchical relaying scheme circumvents
this issue by relaying data between each source-destination pair over a densely populated region in the
network. A detailed description of this scheme is provided in Section IV; the proof of Theorem 1 is
contained in Section VII.

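Theorem 1. Under fast fading, for any $\alpha > 2$, $r_{\min} \in (0, 1)$, and $\delta \in (0, 1/2)$, there exists

\[ b_1(n) \geq n^{-O(\log^{\delta - 1/2}(n))} \]

such that for any $n$, node placement $V(n)$ with minimum separation $r_{\min}$, and permutation traffic matrix
$\lambda(n)$, we have

\[ \rho^*(n) \geq \rho^{\mathrm{HR}}(n) \geq b_1(n) \, n^{1-\alpha/2}. \]

The same conclusion holds for slow fading with probability at least

\[ 1 - \exp\bigl(-2^{\Omega(\log^{1/2+\delta}(n))}\bigr) = 1 - o(1) \]

as $n \to \infty$.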

Theorem 1 shows that the per-node rate $\rho^{\mathrm{HR}}(n)$ achievable by the hierarchical relaying scheme is at
least $n^{1-\alpha/2-\beta(n)}$, where the "loss" term $\beta(n)$ converges to zero as $n \to \infty$ at a rate arbitrarily close to
$O(\log^{-1/2}(n))$ (by choosing $\delta$ small). The performance of the hierarchical relaying scheme can intuitively
be understood as follows. As mentioned before, the scheme achieves cooperation on a global scale. This
leads to a multi-antenna gain of order $n$. On the other hand, communication is over a distance of order
$n^{1/2}$, leading to a power loss of order $n^{-\alpha/2}$. Combining these two factors results in a per-node rate of
$n^{1-\alpha/2}$.
We note that Theorem 1 remains valid under somewhat weaker conditions than having minimum
separation $r_{\min} \in (0, 1)$. Specifically, we show that the result of Özgür et al. [8] can be recovered through
Theorem 1, as the random node placement satisfies these weaker conditions. We discuss this in more detail
in Section XII-D.

The next theorem establishes optimality of the hierarchical relaying scheme in the range of $\alpha \in (2, 3]$
for arbitrary node placement. The proof of the theorem is presented in Section VIII.



Theorem 2. Under either fast or slow fading, for any $\alpha \in (2,3]$, $r_{\min} \in (0,1)$, there exists $b_2(n) =
O(\log^6(n))$ such that for any $n$, node placement $V(n)$ with minimum separation $r_{\min}$, and for $\lambda(n)$
chosen uniformly at random from the set of all permutation traffic matrices, we have

\[ \rho^*(n) \leq b_2(n) \, n^{1-\alpha/2} \]

with probability $1 - o(1)$ as $n \to \infty$.

Note that Theorem 2 holds only with probability $1 - o(1)$ for different reasons for the slow and fast
fading case. For fast fading, this is due to the randomness in the selection of the permutation traffic matrix.
In other words, for fast fading, with high probability we select a traffic matrix for which the theorem
holds. For the slow fading case, there is additional randomness due to the fading realization. Here, with
high probability we select a traffic matrix and we experience a fading realization for which the theorem holds.

Comparing Theorems 1 and 2, we see that for $\alpha \in (2,3]$ the proposed hierarchical relaying scheme is
order optimal, in the sense that

\[ \lim_{n \to \infty} \frac{\log(\rho^*(n))}{\log(n)} = \lim_{n \to \infty} \frac{\log(\rho^{\mathrm{HR}}(n))}{\log(n)} = 1 - \alpha/2. \]

Moreover, the rate it achieves is the same order as is achievable in the case of randomly placed nodes.
Hence in the low path-loss regime $\alpha \in (2,3]$, the heterogeneity caused by the arbitrary node placement
has no effect on achievable communication rates.
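The back-of-the-envelope argument for hierarchical relaying (multi-antenna gain of order $n$, power loss of order $n^{-\alpha/2}$) can be compared numerically with the $n^{-1/2}$ multi-hop scaling known for random node placement. The sketch below only manipulates scaling exponents and ignores all lower-order "loss" terms; the function names are hypothetical.

```python
def hierarchical_relaying_exponent(alpha):
    """Exponent of n in gain * loss = n * n^(-alpha/2)."""
    gain_exponent = 1.0          # multi-antenna gain of order n
    loss_exponent = -alpha / 2   # power loss over a distance of order n^(1/2)
    return gain_exponent + loss_exponent


def multi_hop_exponent():
    """Exponent of n for multi-hop under random node placement: n^(-1/2)."""
    return -0.5


def better_scheme_random_placement(alpha):
    """Scheme with the larger scaling exponent under random node placement."""
    if hierarchical_relaying_exponent(alpha) >= multi_hop_exponent():
        return "cooperative"
    return "multi-hop"
```

The two exponents coincide at $\alpha = 3$, which is the threshold between the two regimes for random node placement.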

B. High Path Loss Regime $\alpha > 3$

We now turn to the high path-loss regime $\alpha > 3$. In the case of randomly placed nodes, multi-hop
communication achieves a per-node rate of $\rho^{\mathrm{MH}}(n) = \Omega(n^{-1/2})$ with probability $1 - o(1)$ and is order
optimal for $\alpha > 3$. For arbitrarily placed nodes, the situation is quite different as Theorem 3 shows. The
proof of Theorem 3 is contained in Section IX.

Theorem 3. Under either fast or slow fading, for any $\alpha > 3$, for any $n$, there exists a node placement
$V(n)$ with minimum separation 1/2 such that for $\lambda(n)$ chosen uniformly at random from the set of all
permutation traffic matrices, we have

\[ \rho^*(n) \leq 2^{2+5\alpha} \, n^{1-\alpha/2}, \]

\[ \rho^{\mathrm{MH}}(n) \leq 4^{\alpha} \, n^{-\alpha/2}, \]

with probability $1 - o(1)$ as $n \to \infty$.

Comparing Theorem 3 with Theorem 1 shows that under adversarial node placement with
minimum-separation constraint the hierarchical relaying scheme is order optimal even when $\alpha > 3$. Moreover,
Theorem 3 shows that there exist node placements satisfying a minimum separation constraint for which
hierarchical relaying achieves a rate of at least a factor of order $n$ higher than multi-hop communication
for any $\alpha > 3$. In other words, for those node placements cooperative communication is necessary for
order optimality also for any $\alpha > 3$, in stark contrast to the situation with random node placement, where
multi-hop communication is order optimal for all $\alpha > 3$.

Theorem 3 suggests that it is the level of regularity of the node placement that decides what scheme to
choose for path-loss exponent $\alpha > 3$. So far, we have seen two extreme cases: For random node placement,
resulting in very regular node placements with high probability, only local cooperation is necessary and
multi-hop is an order-optimal communication scheme. For adversarial arbitrary node placement, resulting
in a very irregular node placement, global cooperation is necessary and hierarchical relaying is an
order-optimal communication scheme. We now make this notion of regularity precise, and show that, depending
on the regularity of the node placement, an appropriate "interpolation" between multi-hop and hierarchical
relaying is required for $\alpha > 3$ to achieve the optimal performance. We refer to this "interpolation" scheme
as cooperative multi-hop communication in the following.



Before we state the result, we need to introduce some notation. Consider again a node placement
$V(n) \subset A(n)$ with minimum separation $r_{\min} \in (0, 1)$. Divide $A(n)$ into squares of sidelength $d(n) \leq \sqrt{n}$,
and fix a constant $\mu \in (0, 1]$. We say that $V(n)$ is $\mu$-regular at resolution $d(n)$ if every such square
contains at least $\mu d^2(n)$ nodes. Note that every node placement is trivially 1-regular at resolution $\sqrt{n}$; a
random node placement can be shown to be $\mu$-regular at resolution $\log(n)$ with probability $1 - o(1)$ as
$n \to \infty$ for any $\mu < 1$; and nodes that are placed on each point in the integer lattice inside $A(n)$ are
1-regular at resolution 1.
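The regularity condition can be checked directly for a given node placement. The following is a minimal sketch, assuming the squares form a grid aligned with the origin, that the sidelength divides $\sqrt{n}$ evenly, and that each square is half-open so a node on a shared edge is counted once; the function name `is_regular` is hypothetical.

```python
import math


def is_regular(nodes, n, d, mu):
    """True if every d x d square of the grid over [0, sqrt(n)]^2 holds >= mu * d^2 nodes."""
    side = math.sqrt(n)
    cells = max(1, round(side / d))  # assumes d divides sqrt(n)
    counts = [[0] * cells for _ in range(cells)]
    for x, y in nodes:
        col = min(int(x / d), cells - 1)  # clamp nodes on the outer boundary
        row = min(int(y / d), cells - 1)
        counts[row][col] += 1
    return all(c >= mu * d * d for row in counts for c in row)
```

For instance, 16 nodes on the integer points $(i, j)$ with $0 \leq i, j \leq 3$ pass the check with `n=16, d=1.0, mu=1.0`, whereas 16 nodes clustered in one corner pass only at resolution `d=4.0`.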

The cooperative multi-hop scheme enables cooperative communication on the scale of regularity $d(n)$.
Neighboring squares of sidelength $d(n)$ cooperatively communicate with each other. To transmit between
a source and its destination, we use multi-hop communication over those squares. In other words, we
use cooperative communication at small scale $d(n)$, and multi-hop communication at large scale $\sqrt{n}$. For
regular node placements, i.e., $d(n) = 1$, the cooperative multi-hop scheme becomes the classical multi-hop
scheme. For very irregular node placement, i.e., $d(n) = n^{1/2}$, the cooperative multi-hop scheme becomes
the hierarchical relaying scheme discussed in the last section.

The next theorem provides a lower bound on the per-node rate $\rho^{\mathrm{CMH}}(n)$ achievable with the cooperative
multi-hop scheme. The proof of the theorem can be found in Section X.

Theorem 4. Under fast fading, for any $\alpha > 2$, $r_{\min} \in (0, 1)$, $\mu \in (0, 1)$, and $\delta \in (0, 1/2)$ there exists

\[ b_3(n) \geq n^{-O(\log^{\delta - 1/2}(n))} \]

such that for any $n$, node placement $V(n)$ with minimum separation $r_{\min}$, and permutation traffic matrix
$\lambda(n)$, we have

\[ \rho^*(n) \geq \rho^{\mathrm{CMH}}(n) \geq b_3(n) \, (d^*(n))^{3-\alpha} \, n^{-1/2}, \]

where

\[ d^*(n) \triangleq \min\{h : V(n) \text{ is } \mu\text{-regular at resolution } h\}. \]

The same conclusion holds for slow fading with probability $1 - o(1)$ as $n \to \infty$.

Theorem 4 shows that if $V(n)$ is regular at resolution $d^*(n)$ then a per-node rate of at least $\rho^{\mathrm{CMH}}(n) \geq
(d^*(n))^{3-\alpha} n^{-1/2-\beta(n)}$ is achievable, where, as before, the "loss" term $\beta(n)$ converges to zero as $n \to \infty$
at a rate arbitrarily close to $O(\log^{-1/2}(n))$. The performance of the cooperative multi-hop scheme can
intuitively be understood as follows. The scheme achieves cooperation on a scale of $d^2(n)$. This leads to
a multi-antenna gain of order $d^2(n)$. On the other hand, communication is over a distance of order $d(n)$,
leading to a power loss of order $d^{-\alpha}(n)$. Moreover, each source-destination pair at a distance of order
$n^{1/2}$ must transmit their data over order $n^{1/2} d^{-1}(n)$ many hops, leading to a multi-hop loss of $n^{-1/2} d(n)$.
Combining these three factors results in a per-node rate of $d^{3-\alpha}(n) n^{-1/2}$.

The next theorem shows that Theorem 4 is tight under adversarial node placement under a constraint
on the regularity. The proof of the theorem is presented in Section XI.

Theorem 5. Under either fast or slow fading, for any $\alpha > 3$, there exists $b_4(n) = O(\log^6(n))$, such that
for any $n$, and $d^*(n)$, there exists a node placement $V(n)$ with minimum separation 1/2 and 1/2-regular
at resolution $d^*(n)$ such that for $\lambda(n)$ chosen uniformly at random from the set of all permutation traffic
matrices, we have

\[ \rho^*(n) \leq b_4(n) \, (d^*(n))^{3-\alpha} \, n^{-1/2}, \]

with probability $1 - o(1)$ as $n \to \infty$.

As an example, assume that

\[ d^*(n) = n^{\eta} \]

for some $\eta \geq 0$. Then Theorem 4 shows that for any node placement of regularity $d^*(n)$ and $\alpha > 3$,

\[ \rho^*(n) \geq \rho^{\mathrm{CMH}}(n) \geq n^{(3-\alpha)\eta - 1/2 - \beta(n)}, \]

where $\beta(n)$ converges to zero as $n \to \infty$ at a rate arbitrarily close to $O(\log^{-1/2}(n))$. In other words,

\[ \lim_{n \to \infty} \frac{\log(\rho^{\mathrm{CMH}}(n))}{\log(n)} \geq (3-\alpha)\eta - 1/2. \]

Moreover, by Theorem 5 there exist node placements with the same regularity such that for random
permutation traffic with high probability $\rho^*(n)$ is (essentially) of the same order, in the sense that

\[ \lim_{n \to \infty} \frac{\log(\rho^*(n))}{\log(n)} \leq (3-\alpha)\eta - 1/2. \]

In particular, for $\eta = 0$ (i.e., regular node placement), and for $\eta = \log\log(n)/\log(n)$ (i.e., random node
placement), we obtain the order $n^{-1/2}$ scaling as expected. For $\eta = 1/2$ (i.e., completely irregular node
placement), we obtain the order $n^{1-\alpha/2}$ scaling as in Theorems 1 and 3.
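The way the three factors of the cooperative multi-hop scheme combine can be checked with a few lines of arithmetic on exponents. This sketch assumes $d(n) = n^{\eta}$ and ignores the lower-order "loss" term; the function name is hypothetical.

```python
def cooperative_multi_hop_exponent(alpha, eta):
    """Exponent of n in d^2 * d^(-alpha) * (n^(-1/2) * d) when d = n^eta."""
    gain = 2 * eta             # multi-antenna gain of order d^2
    power_loss = -alpha * eta  # power loss of order d^(-alpha)
    hop_loss = eta - 0.5       # multi-hop loss of order n^(-1/2) * d
    return gain + power_loss + hop_loss
```

Setting `eta = 0` gives the multi-hop exponent $-1/2$, and `eta = 0.5` gives the hierarchical relaying exponent $1 - \alpha/2$; intermediate values give $(3-\alpha)\eta - 1/2$.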

IV. Hierarchical Relaying Scheme 

This section describes the architecture of our hierarchical relaying scheme. On a high level, the
construction of this scheme is as follows. Consider $n$ nodes $V(n)$ placed arbitrarily on the square region
$A(n)$ with a minimum separation $r_{\min}$. Divide $A(n)$ into squarelets of equal size. Call a squarelet dense
if it contains a number of nodes proportional to its area. For each source-destination pair, choose such a
dense squarelet as a relay, over which it will transmit information (see Figure 1).




Fig. 1. Sketch of one level of the hierarchical relaying scheme. Here $\{(u_i, w_i)\}_{i=1}^{3}$ are three source-destination pairs. Groups of
source-destination pairs relay their traffic over dense squarelets, which contain a number of nodes proportional to their area (shaded). We time
share between the different dense squarelets used as relays. Within all these relay squarelets the scheme is used recursively to enable joint
decoding and encoding at each relay.

Consider now one such relay squarelet and the nodes that are transmitting information over it. If we 
assume for the moment that all the nodes within the same relay squarelet could cooperate then we would 
have a multiple access channel (MAC) between the source nodes and the relay squarelet, where each of 
the source nodes has one transmit antenna, and the relay squarelet (acting as one node) has many receive 
antennas. Between the relay squarelet and the destination nodes, we would have a broadcast channel (BC), 
where each destination node has one receive antenna, and the relay squarelet (acting again as one node) 
has many transmit antennas. The cooperation gain from using this kind of scheme arises from the use of 
multiple antennas for these multiple access and broadcast channels. 

To actually enable this kind of cooperation at the relay squarelet, local communication within the relay 
squarelets is necessary. It can be shown that this local communication problem is actually the same as 
the original problem, but at a smaller scale. Hence we can use the same scheme recursively to solve this
subproblem. We terminate the recursion after several iterations, at which point we use simple TDMA to
bootstrap the scheme.

The construction of the hierarchical relaying scheme is presented in detail in Section IV-A. A
back-of-the-envelope calculation of the per-node rate it achieves is presented in Section IV-B. A detailed analysis
of the hierarchical relaying scheme is presented in Sections VI and VII.

A. Construction 

Recall that

\[ A(b) \triangleq [0, \sqrt{b}]^2 \]

is the square region of area $b$. The scheme described here assumes that $n$ nodes are placed arbitrarily in
$A(n)$ with minimum separation $r_{\min} \in (0, 1)$. We want to find some rate, say $\rho_0$, that can be supported
for all $n$ source-destination pairs of a given permutation traffic matrix $\lambda(n)$. The scheme that is described
below is "recursive" (and hence hierarchical) in the following sense. In order to achieve rate $\rho_0$ for $n$
nodes in $A(n)$, it will use as a building block a scheme for supporting rate $\rho_1$ for a network of

\[ n_1 \triangleq \frac{n}{2\gamma(n)} \]

nodes over $A(a_1)$ (square of area $a_1$) with

\[ a_1 \triangleq \frac{n}{\gamma(n)} \]

for any permutation traffic matrix $\lambda(n_1)$ of $n_1$ nodes. Here the branching factor $\gamma(n)$ is a function such
that $\gamma(n) \to \infty$ as $n \to \infty$. We will optimize over the choice of $\gamma(n)$ later. The same construction is
used for the scheme over $A(a_1)$, and so on. In general, our scheme does the following at level $\ell \geq 0$ of
the hierarchy (or recursion). In order to achieve rate $\rho_\ell$ for any permutation traffic matrix $\lambda(n_\ell)$ over

\[ n_\ell \triangleq \frac{n}{2^{\ell} \gamma^{\ell}(n)} \]

nodes in $A(a_\ell)$, with

\[ a_\ell \triangleq \frac{n}{\gamma^{\ell}(n)}, \]

use a scheme achieving rate $\rho_{\ell+1}$ over $n_{\ell+1}$ nodes in $A(a_{\ell+1})$ for any permutation traffic matrix $\lambda(n_{\ell+1})$.
The recursion is terminated at some level $L(n)$ to be chosen later.
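The recursion parameters can be tabulated with a short sketch. It assumes $n_\ell = n / (2\gamma(n))^{\ell}$ and $a_\ell = n / \gamma^{\ell}(n)$, which is consistent with the relation $n_\ell / (2\gamma(n)) = n_{\ell+1}$ used to define dense squarelets; the function name `hierarchy_levels` and the treatment of $\gamma$ as a fixed number are simplifications made only for this example.

```python
def hierarchy_levels(n, gamma, num_levels):
    """Return [(n_l, a_l)] for l = 0..num_levels.

    n_l = n / (2 * gamma)^l is the number of nodes handled at level l,
    a_l = n / gamma^l is the area of the square they live in.
    """
    levels = []
    for level in range(num_levels + 1):
        n_l = n / (2 * gamma) ** level
        a_l = n / gamma ** level
        levels.append((n_l, a_l))
    return levels
```

With these definitions the node density $n_\ell / a_\ell$ equals $2^{-\ell}$, so it halves at every level of the hierarchy.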

We now describe how the hierarchy is constructed between levels $\ell$ and $\ell + 1$ for $0 \leq \ell < L(n)$.
Each source-destination pair chooses some squarelet as a relay over which it transmits its message. This
relaying of messages takes place in two phases: a multiple access phase and a broadcast phase. We first
describe the selection of relay squarelets, then the operation of the network during the multiple access
and broadcast phases, and finally the termination of the hierarchical construction.

1) Setting up Relays: Given $n_\ell$ nodes in $A(a_\ell)$, divide the square region $A(a_\ell)$ into $\gamma(n)$ equal sized
squarelets. Denote them by $\{A_k(a_{\ell+1})\}_{k=1}^{\gamma(n)}$. Call a squarelet dense if it contains at least $n_\ell / 2\gamma(n) = n_{\ell+1}$
nodes. In other words, a dense squarelet contains a number of nodes of at least a $1/2^{\ell+1}$ fraction of its area.
We show that since the nodes in $A(a_\ell)$ have constant minimum separation $r_{\min}$, a squarelet can contain
at most $O(a_{\ell+1})$ (i.e., $O(a_\ell / \gamma(n))$) nodes, and hence that there are at least $\Theta(2^{-\ell} \gamma(n))$ dense squarelets.
Each source-destination pair chooses a dense squarelet such that both the source and the destination are
at a distance $\Omega(\sqrt{a_{\ell+1}})$ from it. We call this dense squarelet the relay of this source-destination pair. We
show that the relays can be chosen such that each relay squarelet has at most $n_{\ell+1}$ communication pairs
that use it as relay, and we assume this worst case in the following discussion.
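Identifying the dense squarelets at one level amounts to counting nodes per grid cell. The sketch below assumes the branching factor is a perfect square so that the squarelets form a regular grid; the function name `dense_squarelets` and its parameters are hypothetical, and the relay assignment itself (distance condition and load balancing) is not modeled.

```python
import math


def dense_squarelets(nodes, area, gamma, threshold):
    """Grid indices (row, col) of squarelets holding at least `threshold` nodes.

    The square [0, sqrt(area)]^2 is cut into gamma equal squarelets; gamma is
    assumed to be a perfect square so the grid has sqrt(gamma) cells per side.
    """
    per_side = math.isqrt(gamma)
    if per_side * per_side != gamma:
        raise ValueError("gamma must be a perfect square in this sketch")
    cell = math.sqrt(area) / per_side
    counts = {}
    for x, y in nodes:
        col = min(int(x / cell), per_side - 1)  # clamp nodes on the outer boundary
        row = min(int(y / cell), per_side - 1)
        counts[(row, col)] = counts.get((row, col), 0) + 1
    return sorted(key for key, c in counts.items() if c >= threshold)
```

With `threshold` set to the number of nodes divided by `2 * gamma`, at least one squarelet is always dense, since the average occupancy is twice the threshold.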



2) Multiple Access Phase: Source nodes that are assigned to the same (dense) relay squarelet send their
messages simultaneously to that relay. We time share between the $\Theta(2^{-\ell} \gamma(n))$ different relay squarelets.
If the nodes in the relay squarelet could cooperate, we would be dealing with a MAC with at most $n_{\ell+1}$
transmitters, each with one antenna, and one receiver with at least $n_{\ell+1}$ antennas. In order to achieve this
cooperation, we use a hierarchical (i.e., recursive) construction. For this recursive construction, assume
that we have access to a communication scheme to transmit data according to a permutation traffic matrix
$\lambda(n_{\ell+1})$ between $n_{\ell+1}$ nodes located in a square of area $a_{\ell+1}$. We now show how this scheme at scale
$a_{\ell+1}$ can be used to construct a scheme for scale $a_\ell$ (see Figure 2).

Fig. 2. Description of the multiple access phase at level $\ell$ in the hierarchy with $m = n_{\ell+1}$. The first system block represents the wireless
channel, connecting source nodes $\{u_i\}_{i=1}^{m}$ with relay nodes $\{v_i\}_{i=1}^{m}$. The second system block are quantizers $\{q_i\}_{i=1}^{m}$ used at the relay
nodes. The third system block represents using $n_{\ell+1}$ times the communication scheme at level $\ell + 1$ (organized as $n_{\ell+1}$ permutation traffic
matrices $\{\lambda_k(n_{\ell+1})\}_{k=1}^{m}$) to "transpose" the matrix of quantized observations $\{\hat{y}_{ij}\}_{i,j=1}^{m}$. In other words, before the third system block,
node $v_i$ has access to $\{\hat{y}_{ij}\}_{j=1}^{m}$, and after the third system block, node $v_i$ has access to $\{\hat{y}_{ji}\}_{j=1}^{m}$. The fourth system block are matched
filters used at the relay nodes.



Suppose there are $n_{\ell+1}$ source nodes $u_1, \ldots, u_{n_{\ell+1}}$ (located anywhere in $A(a_\ell)$) that relay their message
over the $n_{\ell+1}$ relay nodes $v_1, \ldots, v_{n_{\ell+1}}$ (located in the same dense squarelet of area $a_{\ell+1}$). Each source
node $u_i$ divides its message bits into $n_{\ell+1}$ parts of equal length. Denote by $x_{ij}$ the encoded part $j$ of the
message bits of node $u_i$ ($x_{ij}$ is really a large sequence of channel symbols; to simplify the exposition, we
shall, however, assume it is only a single symbol). The message parts corresponding to $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$ will be
relayed over node $v_j$, as will become clear in the following. Sources $\{u_i\}_{i=1}^{n_{\ell+1}}$ transmit $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$ at time
$j$ for $j \in \{1, \ldots, n_{\ell+1}\}$.

Let $y_{kj}$ be the observed channel output at relay $v_k$ at time $j$. Note that $y_{kj}$ depends only on channel
inputs $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$. In order to decode the message parts corresponding to $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$ at relay node $v_j$, it
needs to obtain the observations $\{y_{kj}\}_{k=1}^{n_{\ell+1}}$ from all other relay nodes. In other words, all relays need to
exchange information. For this, each relay $v_k$ quantizes its observation $\{y_{kj}\}_{j=1}^{n_{\ell+1}}$ at an appropriate rate $K$
independent of $n$ to obtain $\{\hat{y}_{kj}\}_{j=1}^{n_{\ell+1}}$. Quantized observation $\hat{y}_{kj}$ is to be sent from relay $v_k$ to relay $v_j$.
Thus, each of the $n_{\ell+1}$ relay nodes now has a message of size $K$ for every other relay node.

This communication demand within the relay squarelet can be organized as $n_{\ell+1}$ permutation traffic matrices $\{\lambda_j(n_{\ell+1})\}_{j=1}^{n_{\ell+1}}$ between the $n_{\ell+1}$ relay nodes. Note that these relay nodes are located in the same square of area $a_{\ell+1}$. In other words, we are now faced with the original problem, but at smaller scale $a_{\ell+1}$. Therefore, using $n_{\ell+1}$ times the assumed scheme for transmitting according to a permutation traffic matrix for $n_{\ell+1}$ nodes in $A(a_{\ell+1})$, relay $v_j$ can obtain all quantized observations $\{\hat{y}_{kj}\}_{k=1}^{n_{\ell+1}}$. Now $v_j$ uses $n_{\ell+1}$ matched filters on $\{\hat{y}_{kj}\}_{k=1}^{n_{\ell+1}}$ to obtain estimates $\{\hat{x}_{ij}\}_{i=1}^{n_{\ell+1}}$ of $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$. In other words, each node $v_j$ computes (see the footnote on CSI below)

$$ \hat{x}_{ij} = \sum_{k=1}^{n_{\ell+1}} \frac{h^{*}_{u_i,v_k}[j]}{\sqrt{\sum_{\tilde{k}=1}^{n_{\ell+1}} |h_{u_i,v_{\tilde{k}}}[j]|^2}}\, \hat{y}_{kj} $$

for every $i \in \{1, \ldots, n_{\ell+1}\}$. Using these estimates it then decodes the messages corresponding to $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$.
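The matched-filter combining step described above can be illustrated with a small numerical sketch. It is only an illustration under simplifying assumptions that are not part of the scheme: a single source, no noise, and no quantization. The function name `matched_filter` and the toy channel gains are hypothetical.

```python
import math

def matched_filter(h_row, y_hat):
    """Combine relay observations y_hat[k] using gains h_row[k] = h_{u_i, v_k}[j].

    Implements sum_k conj(h_k) * y_k / sqrt(sum_k |h_k|^2).
    """
    norm = math.sqrt(sum(abs(h) ** 2 for h in h_row))
    return sum(h.conjugate() * y for h, y in zip(h_row, y_hat)) / norm

# Toy example (assumed values): one source symbol x observed noise-free at three relays.
h = [1 + 1j, 0.5 - 0.2j, -0.3j]
x = 2 - 1j
y = [hk * x for hk in h]
x_hat = matched_filter(h, y)
norm_h = math.sqrt(sum(abs(hk) ** 2 for hk in h))
```

In this idealized setting the filter output equals the transmitted symbol scaled by the norm of the channel vector, which is the multiple-antenna gain exploited in the rate calculation.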



Footnote: Note that, since we assume full CSI, node $v_j$ has access to the channel gains $\{h_{u_i,v_k}[j]\}_{i,k}$ at any time $t \ge j$. In particular, this is the case at the time the matched filtering is performed.
3) Broadcast Phase: Nodes in the same relay squarelet then send their decoded messages simultaneously to the destination nodes corresponding to this relay. We time share between the different relay squarelets. If the nodes in the relay squarelet could cooperate, we would be dealing with a BC with one transmitter with at least $n_{\ell+1}$ antennas and with at most $n_{\ell+1}$ receivers, each with one antenna. In order to achieve this cooperation, a hierarchical construction similar to the one for the MAC phase is used. As in the MAC phase, assume that we have access to a scheme to transmit data according to a permutation traffic matrix $\lambda(n_{\ell+1})$ between $n_{\ell+1}$ nodes located in a square of area $a_{\ell+1}$. We again use this scheme at scale $a_{\ell+1}$ in the construction of the scheme for scale $a_\ell$ (see Figure 3).
Fig. 3. Description of the broadcast phase at level $\ell$ in the hierarchy with $m = n_{\ell+1}$. The first system block represents transmit beamforming at each of the relay nodes $\{v_i\}_{i=1}^{m}$. The second system block consists of the quantizers $\{q_i\}_{i=1}^{m}$ used at the relay nodes. The third system block represents using $n_{\ell+1}$ times the communication scheme at level $\ell+1$ (organized as $n_{\ell+1}$ permutation traffic matrices $\{\lambda_k(n_{\ell+1})\}_{k=1}^{n_{\ell+1}}$) to "transpose" the matrix of quantized beamformed channel symbols $\{\hat{x}_{ij}\}_{i,j=1}^{m}$. In other words, before the third system block, node $v_i$ has access to $\{\hat{x}_{ji}\}_{j=1}^{m}$, and after the third system block, node $v_i$ has access to $\{\hat{x}_{ij}\}_{j=1}^{m}$. The fourth system block is the wireless channel, connecting relay nodes $\{v_i\}_{i=1}^{m}$ with destination nodes $\{w_i\}_{i=1}^{m}$.



Suppose there are $n_{\ell+1}$ relay nodes $v_1, \ldots, v_{n_{\ell+1}}$ (located in the same dense squarelet of area $a_{\ell+1}$) that relay traffic for $n_{\ell+1}$ destination nodes $w_1, \ldots, w_{n_{\ell+1}}$ (located anywhere in $A(a_\ell)$). Recall that at the end of the MAC phase, each relay node $v_j$ has (assuming decoding was successful) access to parts $j$ of the message bits of all source nodes $\{u_i\}_{i=1}^{n_{\ell+1}}$. Node $v_j$ re-encodes these parts independently; call $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$ the encoded channel symbols (as before, we assume $x_{ij}$ is only a single symbol to simplify exposition). Relay node $v_j$ then performs transmit beamforming on $\{x_{ij}\}_{i=1}^{n_{\ell+1}}$ for the $n_{\ell+1}$ transmit antennas of $\{v_k\}_{k=1}^{n_{\ell+1}}$ to be sent at time $T + j$ (for some appropriately chosen $T > 0$ not depending on $j$). Call $\tilde{x}_{kj}$ the resulting channel symbol to be sent from relay node $v_k$. Then (see the footnote on causal CSI below)

$$ \tilde{x}_{kj} = \sum_{i=1}^{n_{\ell+1}} \frac{h^{*}_{v_k,w_i}[T+j]}{\sqrt{\sum_{\tilde{k}=1}^{n_{\ell+1}} |h_{v_{\tilde{k}},w_i}[T+j]|^2}}\, x_{ij}. $$



In order to actually send this channel symbol, relay node $v_k$ needs to obtain $\tilde{x}_{kj}$ from node $v_j$. Thus, again all relay nodes need to exchange information.

To enable local cooperation within the relay squarelet, each relay node $v_j$ quantizes its beamformed channel symbols $\{\tilde{x}_{kj}\}_{k=1}^{n_{\ell+1}}$ at an appropriate rate $K\log(n)$ with $K$ independent of $n$ to obtain $\{\hat{x}_{kj}\}_{k=1}^{n_{\ell+1}}$. Now, quantized value $\hat{x}_{kj}$ is sent from relay $v_j$ to relay $v_k$. Thus, each of the $n_{\ell+1}$ relay nodes now has a message of size $K\log(n)$ for every other relay node.

This communication demand within the relay squarelet can be organized as $n_{\ell+1}$ permutation traffic matrices $\{\lambda_k(n_{\ell+1})\}_{k=1}^{n_{\ell+1}}$ between the $n_{\ell+1}$ relay nodes. Note that these relay nodes are located in the same square of area $a_{\ell+1}$. Hence, we are again faced with the original problem, but at smaller scale $a_{\ell+1}$. Using $n_{\ell+1}$ times the assumed scheme for transmitting according to a permutation traffic matrix for $n_{\ell+1}$ nodes in $A(a_{\ell+1})$, relay $v_k$ can obtain all quantized beamformed channel symbols $\{\hat{x}_{kj}\}_{j=1}^{n_{\ell+1}}$. Now each $v_k$ sends $\hat{x}_{kj}$ over the wireless channel at time instance $T + j$ (with $T$ chosen to account for the preceding MAC phase and the local cooperation in the BC phase). Call $y_{ij}$ the received channel output at destination node $w_i$ at time instance $T + j$. Using $y_{ij}$, destination node $w_i$ can now decode part $j$ of the message bits of its source node $u_i$.

Footnote: Note that, since we only assume causal CSI, relay node $v_j$ does not actually have access to $\{h_{v_k,w_i}[T+j]\}_{k,i}$ at the time the beamforming is performed. This problem can, however, be circumvented. The details are provided in the proofs (see Lemma 10).

4) Spatial Re-Use and Termination of Recursion: The scheme performs appropriately weighted time-division among the different levels $0 \le \ell \le L(n)$. Within any level $\ell \ge 1$, multiple regions of the original square $A(n)$ of area $n$ are operated in parallel. The details related to the effects of interference between different regions operating at the same level of the hierarchy are discussed in the proofs.

The recursive construction terminates at some large enough level $L = L(n)$ (to be chosen later). At this scale, we have $n_L$ nodes in area $A(a_L)$. A permutation traffic matrix at this level comprises $n_L$ source-destination pairs. These transmissions are performed using TDMA. Again, multiple regions in the original square of area $n$ at level $L$ are active simultaneously.

B. Achievable Rates 

Here we present a back-of-the-envelope calculation of the per-node rate $\rho^{\mathrm{HR}}(n)$ achievable with the hierarchical relaying scheme described in the previous section. The complete proof is stated in Section VII. We assume throughout that long block codes and corresponding optimal decoders are used for transmission.

Instead of computing the rate achieved by hierarchical relaying, it will be convenient to analyze its inverse, i.e., the time utilized for transmission of a single message bit from each source to its destination under a permutation traffic matrix $\lambda(n)$. Using the hierarchical relaying scheme, each message travels through $L$ levels of the hierarchy. Call $\tau_\ell(n)$ the amount of time spent for the transmission of one message bit between each of the $n_\ell$ source-destination pairs at level $\ell$ in the hierarchy. We compute $\tau_\ell(n)$ recursively.

At any level $\ell \ge 1$, there are multiple regions of area $a_\ell$ operating at the same time. Due to the spatial re-use, each of these regions gets to transmit a constant fraction of time. It can be shown that the addition of interference due to this spatial re-use leads only to a constant loss in achievable rate. Hence the time required to send one message bit is only a constant factor higher than the one needed if region $A(a_\ell)$ is considered separately. Consider now one such region $A(a_\ell)$. By the time sharing construction, only one of its $\Theta(2^{-\ell}\gamma(n))$ dense relay squarelets of area $a_{\ell+1}$ is active at any given moment. Hence the time required to operate all relay squarelets is a $\Theta(2^{-\ell}\gamma(n))$ factor higher than for just one relay squarelet separately. Consider now one such relay squarelet, and assume $n_{\ell+1}$ source nodes in $A(a_\ell)$ each communicate $n_{\ell+1}$ message bits to their respective destination nodes through a MAC phase and a BC phase with the help of the $n_{\ell+1}$ relay nodes in this relay squarelet of area $a_{\ell+1}$.

In the MAC phase, each of the $n_{\ell+1}$ sources simultaneously sends one bit to each of the $n_{\ell+1}$ relay nodes. The total time for this transmission is composed of two terms.

i) Transmission of $n_{\ell+1}$ message bits from each of the $n_{\ell+1}$ source nodes to those many relay nodes. Since we time share between $\Theta(2^{-\ell}\gamma(n))$ relay squarelets, we can transmit with an average power constraint of $\Theta(2^{-\ell}\gamma(n))$ during the time a relay squarelet is active, and still satisfy the overall average power constraint of 1. With this "bursty" transmission strategy, we require a total of

$$ O\left(\frac{n_{\ell+1}\, a_\ell^{\alpha/2}}{2^{-\ell}\gamma(n)\, n_{\ell+1}}\right) = O\left(n_{\ell+1}\, 4^{\ell}\gamma^{\ell(1-\alpha/2)}(n)\, n^{\alpha/2-1}\right) \tag{2} $$

channel uses to transmit $n_{\ell+1}$ bits per source node. The terms on the left-hand side of (2) can be understood as follows: $n_{\ell+1}$ is the number of bits to be transmitted; $a_\ell^{\alpha/2}$ is the power loss since most nodes communicate over a distance of $\Theta(a_\ell^{1/2})$; $2^{-\ell}\gamma(n)$ is the average transmit power; $n_{\ell+1}$ is the multiple-antenna gain, since we have that many transmit and receive antennas. The right-hand side follows from $a_\ell = n/\gamma^{\ell}(n)$ and $n_{\ell+1} = n/(2^{\ell+1}\gamma^{\ell+1}(n))$.

ii) We show that constant rate quantization of the received observations at the relays is sufficient. Hence the $n_{\ell+1}$ bits for all sources generate $O(n_{\ell+1})$ transmissions at level $\ell + 1$ of the hierarchy. Therefore,

$$ O\big(n_{\ell+1}\,\tau_{\ell+1}(n)\big) \tag{3} $$

channel uses are needed to communicate all quantized observations to their respective relay nodes.

Combining (2) and (3), accounting for the factor $2^{-\ell}\gamma(n)$ loss due to time division between relay squarelets, we obtain that the transmission time for one message bit from each source to the relay squarelet in the MAC phase at level $\ell$ is

$$ \tau^{\mathrm{MAC}}_\ell(n) = O\left(2^{\ell}\gamma^{\ell(1-\alpha/2)+1}(n)\, n^{\alpha/2-1} + \tau_{\ell+1}(n)\right). \tag{4} $$

Next, we compute the number of channel uses per message bit received by the destination nodes in the BC phase. Similar to the MAC phase, each of the $n_{\ell+1}$ relay nodes has $n_{\ell+1}$ message bits out of which one bit is to be transmitted to each of the $n_{\ell+1}$ destination nodes. Since there are $n_{\ell+1}$ relay nodes, each destination node receives $n_{\ell+1}$ message bits. As before, the required transmission time has two components.

i) Transmission of the encoded and quantized message bits from each of the $n_{\ell+1}$ relay nodes to all other relay nodes at level $\ell + 1$ of the hierarchy. We show that each message bit results in $O((\ell+1)\log n)$ quantized bits. Therefore, $O(n_{\ell+1}(\ell+1)\log n)$ bits need to be transmitted from each relay node. This requires

$$ O\big(n_{\ell+1}(\ell+1)\log(n)\,\tau_{\ell+1}(n)\big) \tag{5} $$

channel uses.

ii) Transmission of $n_{\ell+1}$ message bits from the relay nodes to each destination node. As before, we use bursty transmission with an average power constraint of $\Theta(2^{-\ell}\gamma(n))$ during the fraction $\Theta(2^{\ell}\gamma^{-1}(n))$ of time each relay squarelet is active (this satisfies the overall average power constraint of 1). Using this bursty strategy requires

$$ O\left(\frac{n_{\ell+1}\, a_\ell^{\alpha/2}}{2^{-\ell}\gamma(n)\, n_{\ell+1}}\right) = O\left(n_{\ell+1}\, 4^{\ell}\gamma^{\ell(1-\alpha/2)}(n)\, n^{\alpha/2-1}\right) \tag{6} $$

channel uses for transmission of $n_{\ell+1}$ bits per destination node. As in the MAC phase, $n_{\ell+1}$ in the left-hand side of (6) can be understood as the number of bits to be transmitted, $a_\ell^{\alpha/2}$ as the power loss for communicating over distance $\Theta(a_\ell^{1/2})$, $2^{-\ell}\gamma(n)$ as the average transmit power, and $n_{\ell+1}$ as the multiple-antenna gain.

Combining (5) and (6), accounting for a factor $2^{-\ell}\gamma(n)$ loss due to time division between relay squarelets, the transmission time for one message bit from the relays to each destination node in the BC phase at level $\ell$ is

$$ \tau^{\mathrm{BC}}_\ell(n) = O\left(2^{\ell}\gamma^{\ell(1-\alpha/2)+1}(n)\, n^{\alpha/2-1} + (\ell+1)\log(n)\,\tau_{\ell+1}(n)\right). \tag{7} $$

From (4) and (7), we obtain the following recursion

$$ \tau_\ell(n) = \tau^{\mathrm{MAC}}_\ell(n) + \tau^{\mathrm{BC}}_\ell(n) $$
$$ = O\left(2^{\ell}\gamma^{\ell(1-\alpha/2)+1}(n)\, n^{\alpha/2-1} + (\ell+1)\log(n)\,\tau_{\ell+1}(n)\right) $$
$$ = O\left(2^{L}\gamma(n)\, n^{\alpha/2-1} + L\log(n)\,\tau_{\ell+1}(n)\right), \tag{8} $$

where we have used $\alpha > 2$. This recursion holds for all $0 \le \ell < L$. At level $L$, we use TDMA among $n_L$ nodes in region $A(a_L)$ with a permutation traffic matrix $\lambda(n_L)$. Each of the $n_L$ source-destination pairs uses the wireless channel for a $1/n_L$ fraction of the time at power $O(n_L)$, satisfying the average power constraint. Assuming the received power is less than 1 for all $n$ (so that we operate in the power limited regime), we can achieve a rate of at least $\Omega(a_L^{-\alpha/2})$ between any source-destination pair. Equivalently,

$$ \tau_L(n) = O\big(a_L^{\alpha/2}\big) = O\big(n^{\alpha/2}\gamma^{-L\alpha/2}(n)\big) = O\big(n^{\alpha/2}\gamma^{-L}(n)\big). \tag{9} $$
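To make the recursion (8) with terminal condition (9) concrete, the following sketch unrolls it numerically and compares the result with the closed-form expression obtained by bounding the geometric sum. It is only an illustration: all hidden constants in the $O(\cdot)$ terms are set to 1, logarithms are taken to base 2, and $L$ is treated as an integer; these choices, as well as the function names, are assumptions made here.

```python
import math

def unroll_tau0(n, alpha, L, gamma):
    """Unroll tau_l = 2^L*gamma*n^(alpha/2-1) + L*log2(n)*tau_{l+1} down to level 0."""
    tau = n ** (alpha / 2) * gamma ** (-L)            # terminal condition (9) at level L
    per_level = 2 ** L * gamma * n ** (alpha / 2 - 1)  # additive term in (8)
    for _ in range(L):                                 # levels L-1, ..., 0
        tau = per_level + L * math.log2(n) * tau
    return tau

def closed_form(n, alpha, L, gamma):
    """n^(alpha/2-1) * (L log n)^L * (2^L gamma + n gamma^-L)."""
    return (n ** (alpha / 2 - 1) * (L * math.log2(n)) ** L
            * (2 ** L * gamma + n * gamma ** (-L)))

# Example with assumed values: n = 2^40, alpha = 3, L = 4, gamma = n^(1/(L+1)) = 2^8.
tau0 = unroll_tau0(2 ** 40, 3.0, 4, 256.0)
bound = closed_form(2 ** 40, 3.0, 4, 256.0)
```

For these values the unrolled time stays below the closed-form expression, since the sum of the powers of $L\log(n)$ up to $L-1$ is at most $(L\log(n))^{L}$.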
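Combining (8) and (9), we have

$$ \tau_0(n) = O\left(n^{\alpha/2-1}\, 2^{L}\gamma(n) + L\log(n)\,\tau_1(n)\right) $$
$$ = O\left(n^{\alpha/2-1}\big(L\log(n)\big)^{L}\big(2^{L}\gamma(n) + n\gamma^{-L}(n)\big)\right). \tag{10} $$

The term

$$ \big(L\log(n)\big)^{L}\big(2^{L}\gamma(n) + n\gamma^{-L}(n)\big) $$

is the "loss" factor over the desired order $n^{\alpha/2-1}$ scaling, and we now choose the branching factor $\gamma(n)$ and the hierarchy depth $L = L(n)$ to make it small. Fix a $\delta \in (0, 1/2)$ and set

$$ L(n) \triangleq \log^{1/2-\delta}(n), $$
$$ \gamma(n) \triangleq n^{1/(L(n)+1)}. $$

With this

$$ \big(L(n)\log(n)\big)^{L(n)} \le n^{2\log^{-1/2-\delta}(n)\log\log(n)}, $$
$$ 2^{L(n)}\gamma(n) \le n^{\log^{-1/2-\delta}(n) + \log^{\delta-1/2}(n)}, $$
$$ n\gamma^{-L(n)}(n) \le n^{\log^{\delta-1/2}(n)}. $$

Since $\delta > 0$, the $n^{\log^{\delta-1/2}(n)}$ term dominates in (10), and we obtain

$$ \tau_0(n) \le \tilde{b}(n)\, n^{\alpha/2-1}, $$

where

$$ \tilde{b}(n) \le n^{O(\log^{\delta-1/2}(n))}. $$

Hence the per-node rate of the hierarchical relaying scheme is lower bounded as

$$ \rho^{\mathrm{HR}}(n) \ge b(n)\, n^{1-\alpha/2} $$

with

$$ b(n) \ge n^{-O(\log^{\delta-1/2}(n))}. $$

Note that to minimize the loss term, we should choose $\delta > 0$ to be small.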
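As a rough numerical illustration of this parameter choice, the sketch below evaluates $L(n)$, $\gamma(n)$, and the exponent (base $n$) of the loss factor for given $n$ and $\delta$. Logarithms are taken to base 2, $L(n)$ is left real-valued, and all constants are set to 1; these are simplifying assumptions of the sketch, and the function names are hypothetical.

```python
import math

def hierarchy_parameters(n, delta):
    """Return (L, gamma) with L = log^(1/2-delta)(n) and gamma = n^(1/(L+1))."""
    log_n = math.log2(n)
    L = log_n ** (0.5 - delta)
    gamma = n ** (1.0 / (L + 1.0))
    return L, gamma

def loss_exponent(n, delta):
    """Return log_n of the loss factor (L log n)^L * (2^L gamma + n gamma^-L)."""
    log_n = math.log2(n)
    L, gamma = hierarchy_parameters(n, delta)
    log2_loss = (L * math.log2(L * log_n)
                 + math.log2(2 ** L * gamma + n * gamma ** (-L)))
    return log2_loss / log_n

L64, gamma64 = hierarchy_parameters(2 ** 64, 0.25)
```

Under these assumptions the loss exponent decreases as $n$ grows (for example between $n = 2^{64}$ and $n = 2^{512}$ with $\delta = 1/4$), but only slowly, which is consistent with the loss being of the form $n^{o(1)}$.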

V. Cooperative Multi-Hop Scheme 

In this section, we provide a brief description of the cooperative multi-hop scheme. The details of the construction and the analysis of its performance can be found in Section X.

Recall that a node placement $V(n)$ is $\mu$-regular at resolution $d(n)$ if every square $[i\,d(n), (i+1)d(n)] \times [j\,d(n), (j+1)d(n)]$ for some $i, j \in \mathbb{N}$ contains at least $\mu d^{2}(n)$ nodes. Given such a node placement $V(n)$, divide it into squares of sidelength $d(n)$. Consider four adjacent squares, combined into a bigger square of sidelength $2d(n)$. By the regularity assumption on $V(n)$, this bigger square contains at least $4\mu d^{2}(n)$ nodes. Hence we can apply the hierarchical relaying scheme introduced in the last section to support any permutation traffic within this bigger square at a per-node rate of

$$ b(n)\big(d^{2}(n)\big)^{1-\alpha/2} = b(n)\, d^{2-\alpha}(n), $$

where $b(n)$ is essentially of order $n^{-\log^{\delta-1/2}(n)}$. By properly choosing the permutation traffic matrices within every possible such bigger square of sidelength $2d(n)$, this creates an equivalent communication graph with $n/d^{2}(n)$ nodes, each corresponding to a square of sidelength $d(n)$ in $A(n)$, and with edges between nodes corresponding to neighboring squares. With the above communication procedure and appropriate spatial re-use, each such edge has a capacity of

$$ b(n)\, d^{4-\alpha}(n). $$
The resulting communication graph is depicted in Figure 4.

Fig. 4. Communication graph (in bold) resulting from the construction of the cooperative multi-hop scheme. The entire square has sidelength $\sqrt{n}$, and the dashed squares have sidelength $d(n)$. Each (bold) edge in the communication graph corresponds to using the hierarchical relaying scheme between the nodes in the adjacent squares of sidelength $d(n)$.

Now, to send a message from a source node in $V(n)$ to its destination node, we first locate the squares of sidelength $d(n)$ they are located in. We then route the message over the edges of the communication graph constructed above in a multi-hop fashion. By the construction of the communication graph, each such edge is implemented using the hierarchical relaying scheme. In other words, we perform multi-hop communication over distance $\sqrt{n}$ with hop length $d(n)$, and each such hop is implemented using hierarchical relaying over distance $d(n)$. Since each edge in the communication graph has a capacity of

$$ b(n)\, d^{4-\alpha}(n) $$

and has to support roughly $n^{1/2} d(n)$ source-destination pairs, we obtain a per-node rate of

$$ \rho^{\mathrm{CMH}}(n) \ge b(n)\, d^{4-\alpha}(n)\, n^{-1/2}\, d^{-1}(n) = b(n)\, d^{3-\alpha}(n)\, n^{-1/2} $$

per source-destination pair.
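The interpolation between multi-hop and cooperative communication can be read off from this expression by tracking exponents of $n$. The sketch below assumes a resolution of the form $d(n) = n^{\kappa}$ and ignores the factor $b(n)$; both the parametrization by $\kappa$ and the function name are assumptions made for illustration.

```python
def cmh_rate_exponent(alpha, kappa):
    """Exponent e with d^(3-alpha) * n^(-1/2) = n^e for d(n) = n^kappa (b(n) ignored)."""
    return kappa * (3.0 - alpha) - 0.5

# Constant resolution (kappa = 0) recovers the multi-hop scaling n^(-1/2).
multi_hop = cmh_rate_exponent(4.0, 0.0)
# Resolution sqrt(n) (kappa = 1/2) recovers the hierarchical relaying scaling n^(1 - alpha/2).
cooperative = cmh_rate_exponent(4.0, 0.5)
```

For $\alpha > 3$ the exponent decreases in $\kappa$, so under these assumptions a more regular node placement (smaller $d(n)$) yields a higher per-node rate.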



VI. Analysis of the Hierarchical Relaying Scheme 

In this section, we analyze in detail the hierarchical relaying scheme. Throughout Sections VI-A to VI-C, we consider communication at level $\ell$, $0 \le \ell < L = L(n)$, of the hierarchy. All constants $K_i$ are independent of $\ell$.

Recall that at level $\ell$, we have a square region $A(a_\ell)$ of area

$$ a_\ell \triangleq \frac{n}{\gamma^{\ell}(n)} $$

containing

$$ n_\ell \triangleq \frac{n}{2^{\ell}\gamma^{\ell}(n)} $$

nodes $V(n_\ell)$. We divide $A(a_\ell)$ into $\gamma(n)$ squarelets of area $a_{\ell+1}$. Recall that a squarelet of area $a_{\ell+1}$ in level $\ell$ of the hierarchy is called dense if it contains at least $n_{\ell+1}$ nodes. We impose a power constraint of $P_\ell(n) = \Theta(2^{-\ell}\gamma(n))$ during the time any particular relay squarelet is active. Since we time share between $\Theta(2^{-\ell}\gamma(n))$ relay squarelets, this satisfies the overall average power constraint (by choosing constants appropriately).

Since other regions of area $a_\ell$ are active at the same time as the one under consideration, we have to deal with interference. To this end, we consider a slightly more general noise model that includes the interference experienced at the relay squarelets. More precisely, we assume that, for all $u \in V(n_\ell)$, the additive noise term $\{z_u[t]\}_t$ is independent of the signal $\{x_u[t]\}_t$ and of the channel gains $\{h_{u,v}[t]\}_{v,t}$; that the noise term is stationary and ergodic across time $t$, but with arbitrary dependence across nodes $u$; and that the noise has zero mean and bounded power $N_0$ independent of $n$. Note that we do not require the additive noise term to be Gaussian. In the above, $N_0$ accounts for both noise (which has power 1 in the original model) and interference. We show in Section VII that these assumptions are valid.

Recall the following choice of $\gamma(n)$ and $L(n)$:

$$ L(n) \triangleq \log^{1/2-\delta}(n), $$
$$ \gamma(n) \triangleq n^{1/(L(n)+1)}, \tag{11} $$

with $\delta \in (0, 1/2)$ independent of $n$. This choice satisfies

$$ \gamma(n) \le \gamma(\tilde{n}) \text{ if } n \le \tilde{n}, $$
$$ \gamma^{L(n)}(n) \le n \text{ for all } n, \tag{12} $$
$$ 2^{-L(n)}\gamma(n) \to \infty \text{ as } n \to \infty. $$

The first condition in (12) implies that the number of squarelets $\gamma(n)$ we divide $A(n)$ into increases in $n$. The second condition implies that the squarelet area $a_{L(n)}$ at the last level of the hierarchy is bigger than 1. As we shall see, the third condition implies that the number of dense squarelets at the last level (and hence at every level) grows unbounded as $n \to \infty$ (see Lemma 6 below).

Throughout Section VI we consider the fast fading channel model. Slow fading is discussed in Section VII-B.

A. Setting up Relays 

The first lemma states that the minimum-separation requirement $r_{\min} \in (0, 1)$ implies that a constant fraction of squarelets must be dense. We point out that this is the only consequence of the minimum-separation requirement used to prove Theorem 1. Thus Theorem 1 remains valid if we just assume that Lemma 6 below holds directly. See also Section XII-D for further details.

Lemma 6. For any $V(n_\ell) \subset A(a_\ell)$ with $|V(n_\ell)| \ge n_\ell$ and with minimum separation $r_{\min} \in (0, 1)$, each of its squarelets of area $a_{\ell+1}$ contains at most $K_1 a_\ell/\gamma(n)$ nodes, and there are at least $K_2 2^{-\ell}\gamma(n)$ dense squarelets.

Proof. Put a circle of radius $r_{\min}/2$ around each node. By the minimum-separation requirement, these circles do not intersect. Each node covers an area of $\pi r_{\min}^{2}/4$. Increasing the sidelength of each squarelet by $r_{\min}$, this provides a total area of

$$ \left(\sqrt{\frac{a_\ell}{\gamma(n)}} + r_{\min}\right)^{2} \le \frac{a_\ell}{\gamma(n)}\,(1 + r_{\min})^{2} $$

in which the circles around these nodes are packed. Here we have used that $\gamma^{L(n)}(n) \le n$ by (12), and therefore

$$ \gamma(n) \le n/\gamma^{\ell}(n) = a_\ell. $$

Hence there can be at most $K_1 a_\ell/\gamma(n)$ nodes per squarelet with

$$ K_1 \triangleq \frac{4(1 + r_{\min})^{2}}{\pi r_{\min}^{2}}. $$

Note that, since $r_{\min} < 1$, we have $K_1 \ge 1$.

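Let $d(n_\ell)$ be the number of dense squarelets in $A(a_\ell)$, and therefore $\gamma(n) - d(n_\ell)$ is the number of squarelets that are not dense. By the argument in the last paragraph, each dense squarelet contains at most $K_1 a_\ell/\gamma(n)$ nodes, and those squarelets that are not dense contain less than $n_{\ell+1}$ nodes by the definition of dense squarelets. Hence $d(n_\ell)$ must satisfy

$$ d(n_\ell) K_1 a_\ell/\gamma(n) + \big(\gamma(n) - d(n_\ell)\big) n_{\ell+1} \ge |V(n_\ell)| \ge n_\ell. $$

Thus, using $a_\ell = 2^{\ell} n_\ell$ and $n_{\ell+1} = n_\ell/(2\gamma(n))$, we have

$$ d(n_\ell) K_1 2^{\ell} + \big(\gamma(n) - d(n_\ell)\big)/2 \ge \gamma(n). $$

As $K_1 2^{\ell} \ge 1$, this yields

$$ d(n_\ell) \ge \frac{1 - 1/2}{K_1 2^{\ell} - 1/2}\,\gamma(n) \ge \frac{2^{-\ell}}{2K_1}\,\gamma(n) = K_2 2^{-\ell}\gamma(n) $$

with

$$ K_2 \triangleq \frac{1}{2K_1}. $$

□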
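For a feeling of the constants in Lemma 6, the following sketch evaluates $K_1$ and $K_2$ for a given minimum separation and the resulting lower bound on the number of dense squarelets. The function names are hypothetical, and the numbers only illustrate the formulas above; they are not tuned in any way.

```python
import math

def lemma6_constants(r_min):
    """Return (K1, K2) with K1 = 4(1+r_min)^2 / (pi r_min^2) and K2 = 1/(2 K1)."""
    k1 = 4.0 * (1.0 + r_min) ** 2 / (math.pi * r_min ** 2)
    k2 = 1.0 / (2.0 * k1)
    return k1, k2

def min_dense_squarelets(level, gamma, r_min):
    """Lower bound K2 * 2^(-level) * gamma on the number of dense squarelets."""
    _, k2 = lemma6_constants(r_min)
    return k2 * 2.0 ** (-level) * gamma

k1, k2 = lemma6_constants(0.5)
```

As expected from the packing argument, a smaller minimum separation gives a larger $K_1$ and hence a weaker guarantee on the number of dense squarelets.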

Consider $V(n_\ell) \subset A(a_\ell)$ satisfying the assumptions of Lemma 6, and choose arbitrary $K_2 2^{-\ell}\gamma(n)$ dense squarelets of area $a_{\ell+1}$ (as guaranteed by Lemma 6). Call those squarelets $\{A_k(a_{\ell+1})\}_{k=1}^{K_2 2^{-\ell}\gamma(n)}$. For each source-destination pair, we now select one such dense squarelet to relay traffic over. To avoid bottlenecks, this selection has to be done such that all relay squarelets carry approximately the same amount of traffic. Moreover, for technical reasons, the distances from the source and the destination to the relay squarelet cannot be too small.

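Formally, the selection of relay squarelets can be described by the schedules $S \in \{0,1\}^{n_\ell \times K_2 2^{-\ell}\gamma(n)}$ with $S_{u,k} = 1$ if source node $u$ relays traffic over dense squarelet $k$, and $\tilde{S} \in \{0,1\}^{K_2 2^{-\ell}\gamma(n) \times n_\ell}$ with $\tilde{S}_{k,w} = 1$ if destination node $w$ receives traffic from dense squarelet $k$. With slight abuse of notation, let $r_{u,A_k(a_{\ell+1})}$ be the distance between node $u \in V(n_\ell)$ and the closest point in $A_k(a_{\ell+1})$, i.e.,

$$ r_{u,A_k(a_{\ell+1})} \triangleq \min_{v \in A_k(a_{\ell+1})} r_{u,v}. \tag{13} $$

Define the sets

$$ \mathcal{S}(n_\ell) \triangleq \Big\{ S \in \{0,1\}^{n_\ell \times K_2 2^{-\ell}\gamma(n)} : \ \sum_{u} S_{u,k} \le n_{\ell+1} \ \forall k, \quad \sum_{k} S_{u,k} \le 1 \ \forall u, \quad S_{u,k} = 1 \text{ implies } r_{u,A_k(a_{\ell+1})} \ge \sqrt{2a_{\ell+1}} \ \forall u, k \Big\} \tag{14} $$

and

$$ \tilde{\mathcal{S}}(n_\ell) \triangleq \big\{ \tilde{S} : \tilde{S}^{T} \in \mathcal{S}(n_\ell) \big\}. $$

The sets $\mathcal{S}(n_\ell)$ and $\tilde{\mathcal{S}}(n_\ell)$ are the collections of schedules satisfying the conditions mentioned in the last paragraph. More precisely, the first condition in (14) ensures that at most $n_{\ell+1}$ source-destination pairs relay over the same dense squarelet, the second condition ensures that each source-destination pair chooses at most one relay squarelet, and the third condition ensures that sources and destinations are at least at distance $\sqrt{2a_{\ell+1}}$ from the chosen relay squarelet.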

Next, we prove that any node placement that satisfies Lemma 6 allows for a decomposition of any permutation traffic matrix $\lambda(n_\ell)$ into a small number of schedules belonging to $\mathcal{S}(n_\ell)$ and $\tilde{\mathcal{S}}(n_\ell)$.
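Lemma 7. There exists $K_3$ such that for all $n$ large enough (independent of $\ell$), and every permutation traffic matrix $\lambda(n_\ell) \in \{0,1\}^{n_\ell \times n_\ell}$, we can find $K_3 2^{\ell}$ schedules $\{S^{(i)}(n_\ell)\}_{i=1}^{K_3 2^{\ell}} \subset \mathcal{S}(n_\ell)$, $\{\tilde{S}^{(i)}(n_\ell)\}_{i=1}^{K_3 2^{\ell}} \subset \tilde{\mathcal{S}}(n_\ell)$ satisfying

$$ \lambda(n_\ell) = \sum_{i=1}^{K_3 2^{\ell}} S^{(i)}(n_\ell)\,\tilde{S}^{(i)}(n_\ell). $$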

Proof. Pick an arbitrary source-destination pair in $\lambda(n_\ell)$, and consider the squarelets containing the source and the destination node. Since each squarelet has side length $\sqrt{a_{\ell+1}}$, there are at most 50 squarelets at distance less than $\sqrt{2a_{\ell+1}}$ from either of those two squarelets. As $2^{-L(n)}\gamma(n) \to \infty$ as $n \to \infty$ by (12), there exists $K$ (independent of $\ell$) such that for $n \ge K$ we have $50 \le K_2 2^{-\ell-1}\gamma(n)$. Since there are at least $K_2 2^{-\ell}\gamma(n)$ dense squarelets by Lemma 6, there must exist at least $K_2 2^{-\ell-1}\gamma(n)$ dense squarelets that are at distance at least $\sqrt{2a_{\ell+1}}$ from both the squarelets containing the source and the destination node.

In order to construct a decomposition of $\lambda(n_\ell)$, we use the following procedure. Sequentially, each of the $n_\ell$ source-destination pairs chooses one of the (at least) $K_2 2^{-\ell-1}\gamma(n)$ dense squarelets at distance at least $\sqrt{2a_{\ell+1}}$ that has not already been chosen by $n_{\ell+1}$ other pairs. If any source-destination pair cannot select such a squarelet, then stop the procedure and use the source-destination pairs matched with dense squarelets so far to define matrices $S^{(1)}(n_\ell)$ and $\tilde{S}^{(1)}(n_\ell)$. Now, remove all the matched source-destination pairs, forget that dense squarelets were matched to any source-destination pair, and redo the above procedure, going through the remaining source-destination pairs.

Let

$$ K_3 \triangleq 4/K_2. $$

We claim that by repeating this process of generating matrices $S^{(i)}(n_\ell)$ and $\tilde{S}^{(i)}(n_\ell)$, we can match all source-destination pairs to some dense squarelet with at most $K_3 2^{\ell}$ such matrices. Indeed, a new pair of matrices is generated only when a source-destination pair cannot be matched to any of its available (at least) $K_2 2^{-\ell-1}\gamma(n)$ dense squarelets. If this happens, all these dense squarelets are matched by $n_{\ell+1} = n_\ell/(2\gamma(n))$ pairs. Hence at least $K_2 2^{-\ell-2} n_\ell$ source-destination pairs are matched in each "round". Since there are $n_\ell$ total pairs, we need at most

$$ \frac{n_\ell}{K_2 2^{-\ell-2} n_\ell} = K_3 2^{\ell} $$

matrices $S^{(i)}(n_\ell)$ and $\tilde{S}^{(i)}(n_\ell)$. □
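The sequential matching procedure used in this proof can be sketched as a small greedy routine. The sketch is a simplified illustration rather than the construction itself: the admissible squarelets of each pair (those far enough from its source and destination) are assumed to be given as lists, and the names `greedy_schedules`, `allowed`, and `capacity` are hypothetical.

```python
def greedy_schedules(allowed, capacity):
    """Split source-destination pairs into rounds (schedules).

    allowed[p] lists the dense squarelets admissible for pair p; within a round
    each squarelet serves at most `capacity` pairs. Returns a list of rounds,
    each a dict mapping pair index -> chosen squarelet.
    """
    remaining = list(range(len(allowed)))
    rounds = []
    while remaining:
        load, assignment, leftover = {}, {}, []
        stopped = False
        for p in remaining:
            choice = None
            if not stopped:
                choice = next((k for k in allowed[p] if load.get(k, 0) < capacity), None)
            if choice is None:
                stopped = True          # stop the round at the first unmatched pair
                leftover.append(p)
            else:
                load[choice] = load.get(choice, 0) + 1
                assignment[p] = choice
        if not assignment:
            raise ValueError("a pair has no admissible squarelet with free capacity")
        rounds.append(assignment)
        remaining = leftover
    return rounds

# Five pairs, two admissible squarelets each, one pair per squarelet per round.
example_rounds = greedy_schedules([[0, 1]] * 5, 1)
```

In the example, every round except possibly the last fills all admissible squarelets before stopping, which is the counting argument behind the bound on the number of schedules.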

For a permutation traffic matrix $\lambda(n_\ell)$, communication proceeds as follows. Write

$$ \lambda(n_\ell) = \sum_{i=1}^{K_3 2^{\ell}} S^{(i)}(n_\ell)\,\tilde{S}^{(i)}(n_\ell) $$

as in Lemma 7. Split time into $K_3 2^{\ell}$ equal length time slots. In slot $i$, we use $S^{(i)}(n_\ell)\tilde{S}^{(i)}(n_\ell)$ as our traffic matrix. Consider without loss of generality $i = 1$ in the following. Write

$$ S^{(1)}(n_\ell)\,\tilde{S}^{(1)}(n_\ell) = \sum_{k=1}^{K_2 2^{-\ell}\gamma(n)} S^{(1,k)}(n_{\ell+1})\,\tilde{S}^{(1,k)}(n_{\ell+1}), $$

where $S^{(1,k)}(n_{\ell+1})\tilde{S}^{(1,k)}(n_{\ell+1})$ is the traffic relayed over the dense squarelet $A_k(a_{\ell+1})$. We time share between the schedules for $k \in \{1, \ldots, K_2 2^{-\ell}\gamma(n)\}$. Consider now any such $k$. In the worst case, there are exactly $n_{\ell+1}$ communication pairs to be relayed over $A_k(a_{\ell+1})$, and the relay squarelet $A_k(a_{\ell+1})$ contains exactly $n_{\ell+1}$ nodes. We shall assume this worst case in the following.

We focus on the transmission according to the traffic matrix $S^{(1,1)}(n_{\ell+1})\tilde{S}^{(1,1)}(n_{\ell+1})$. Let $V(n_{\ell+1})$ be the nodes in $A_1(a_{\ell+1})$, and let $U(n_{\ell+1})$ and $W(n_{\ell+1})$ be the source and destination nodes of $S^{(1,1)}(n_{\ell+1})\tilde{S}^{(1,1)}(n_{\ell+1})$, respectively. In other words, the source nodes $U(n_{\ell+1})$ communicate to their respective destination nodes $W(n_{\ell+1})$ using the nodes $V(n_{\ell+1})$ as relays.
B. Multiple Access Phase

Each source node in $U(n_{\ell+1})$ splits its message into $n_{\ell+1}$ equal length parts. Part $j$ at every node $u \in U(n_{\ell+1})$ is to be relayed over the $j$-th node in $V(n_{\ell+1})$. Each part is separately encoded at the source and separately decoded at the destination. After the source nodes are done transmitting their messages, the nodes in the relay squarelet quantize their (sampled) observations corresponding to part $j$ and communicate the quantized values to the $j$-th node in the relay squarelet. This node then decodes the $j$-th message parts of all source nodes. Note that this induces a uniform traffic pattern between the nodes in the relay squarelet, i.e., every node needs to transmit quantized observations to every other node. While this traffic pattern does not correspond to a permutation traffic matrix, it can be written as a sum of $n_{\ell+1}$ permutation traffic matrices. A fraction $1/n_{\ell+1}$ of the traffic within the relay squarelet is transmitted according to each of these permutation traffic matrices. This setup is depicted in Figure 2 in Section IV-A.
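One simple way to write a uniform (all-to-all) traffic pattern among $m$ nodes as a sum of $m$ permutation traffic matrices is to use cyclic shifts. The sketch below shows this particular decomposition; it is one possible choice made for illustration and is not claimed to be the one used in the construction. The function name is hypothetical.

```python
def cyclic_permutations(m):
    """Return m permutations; permutation s sends node i to node (i + s) % m."""
    return [[(i + s) % m for i in range(m)] for s in range(m)]

# Count how often each ordered (sender, receiver) pair is served for m = 4.
perms = cyclic_permutations(4)
coverage = {}
for perm in perms:
    for sender, receiver in enumerate(perm):
        coverage[(sender, receiver)] = coverage.get((sender, receiver), 0) + 1
```

Every ordered pair of nodes appears in exactly one of the $m$ permutations (the shift $s = 0$ corresponds to a node keeping its own observation), so serving each permutation for a $1/m$ fraction of the traffic covers the uniform pattern.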

Assuming for the moment that we have a scheme to send the quantized observations to the dedicated node in the relay squarelet, the traffic matrix $S^{(1,1)}(n_{\ell+1})$ between $U(n_{\ell+1})$ and $V(n_{\ell+1})$ then describes a MAC with $n_{\ell+1}$ transmitters, each with one antenna, and one receiver with $n_{\ell+1}$ antennas. We call this the MAC induced by $S^{(1,1)}(n_{\ell+1})$ in the following. Before we analyze the rate achievable over this induced MAC, we need an auxiliary result on quantized channels.
Fig. 5. Sketch of the quantized channel. $f$ and $\varphi$ are the channel encoder and decoder, respectively; $\{q_k\}_{k=1}^{m}$ are quantizers; $P_{y|x}$ and $P_{\hat{x}|\hat{y}}$ represent stationary ergodic channels with the indicated marginal distributions.

Consider the quantized channel in Figure 5. Here, $f$ is the channel encoder, $\varphi$ the channel decoder, and $\{q_k\}_{k=1}^{m}$ are quantizers. All these have to be chosen. $P_{y|x}$ and $P_{\hat{x}|\hat{y}}$, on the other hand, represent fixed stationary ergodic channels with the indicated marginal distributions. We call $R$ the rate of the channel code $(f, \varphi)$ and $\{R_k\}_{k=1}^{m}$ the rates of the quantizers $\{q_k\}_{k=1}^{m}$.

Lemma 8. If there exist distributions $P_x$ and $\{P_{\hat{y}_k|y_k}\}_{k=1}^{m}$ such that $R < I(x; \hat{x})$ and $R_k > I(y_k; \hat{y}_k)$ for all $k$, then $(R, \{R_k\}_{k=1}^{m})$ is achievable over the quantized channel.

Proof. The proof follows from a simple extension of Theorem 1 in Appendix II of [8]. □

Lemma 9. Let the additive noise $\{z_v\}_{v \in V(n_{\ell+1})}$ be uncorrelated (over $v$). For the MAC induced by $S^{(1,1)}(n_{\ell+1})$ with per-node average power constraint $P_\ell(n) \le n_{\ell+1}^{-1} a_\ell^{\alpha/2}$, a rate of

$$ \rho^{\mathrm{MAC}}_\ell(n) \ge K_4 P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2} $$

per source node is achievable, and the number of bits required at each relay node to quantize the observations is at most $K_5$ bits per $n_{\ell+1}$ total message bits sent by the source nodes. (Total message bits refers to the sum of all message bits transmitted by the $n_{\ell+1}$ source nodes.)

Proof. The source nodes send signals with a power of (essentially) $n_{\ell+1}^{-1} a_\ell^{\alpha/2}$ for a fraction $P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2} \le 1$ of time and are silent for the remaining time. To ensure that interference is uniform, the time slots during which the nodes send signals are chosen randomly as follows. Generate independently for each region $A(a_\ell)$ a Bernoulli process $\{B[t]\}_{t \in \mathbb{N}}$ with parameter $P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2}/(1 + \eta) \le 1$ for some small $\eta > 0$. The nodes in $A(a_\ell)$ are active whenever $B[t] = 1$ and remain silent otherwise. Since the blocklength of the codes used is assumed to be large, this satisfies the average power constraint of $P_\ell(n)$ with high probability for any $\eta > 0$. Since we are interested only in the scaling of capacity, we ignore the additional $1/(1 + \eta)$ term in the following to simplify notation. Clearly, we only need to consider the fraction of time during which $B[t] = 1$.

Let $y$ be the received vector at the relay squarelet, and $\hat{y}$ the (componentwise) quantized observations. We use a matched filter at the relay squarelet, i.e.,

$$ \hat{x}_u = \frac{h_u^{\dagger}}{\|h_u\|}\,\hat{y}, $$

where the column vector $h_u = \{h_{u,v}\}_{v \in V(n_{\ell+1})}$ contains the channel gains between node $u \in U(n_{\ell+1})$ and the nodes in the relay squarelet $V(n_{\ell+1})$. The use of a matched filter is possible since we assume full CSI is available at all the nodes.

We now use Lemma 8 to show that we can design quantizers $\{q_v\}_{v \in V(n_{\ell+1})}$ of constant rate and achieve a per-node communication rate of at least $K_4 P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2}$. The first channel in Lemma 8 (see Figure 5) will correspond to the wireless channel between a source node $u$ and its relay squarelet $V(n_{\ell+1})$. The second "channel" in Lemma 8 will correspond to the matched filter used at the relay squarelet. To apply Lemma 8 we need to find a distribution for $x_u$ and for $\hat{y}_v | y_v$. Define

$$ \tilde{r}_u \triangleq r_{u,A_1(a_{\ell+1})}/\sqrt{2a_\ell} \le 1 $$

with $r_{u,A_1(a_{\ell+1})}$ as in (13), to be the normalized distance of the source node $u \in U(n_{\ell+1})$ to the relay squarelet $A_1(a_{\ell+1})$. For each $u \in U(n_{\ell+1})$ let $x_u \sim \mathcal{N}_{\mathbb{C}}(0, \tilde{r}_u^{\alpha}\, n_{\ell+1}^{-1} a_\ell^{\alpha/2})$ independent of $x_{\tilde{u}}$ for $\tilde{u} \ne u$, and let $\hat{y}_v = y_v + \tilde{z}_v$ for $\tilde{z}_v \sim \mathcal{N}_{\mathbb{C}}(0, \Delta^{2})$ independent of $y$ and for some $\Delta^{2} > 0$. Note that the channel input $x_u$ has power that depends on the normalized distance $\tilde{r}_u$ (i.e., only nodes $u \in U(n_{\ell+1})$ that are at maximal distance $\sqrt{2a_\ell}$ from the relay squarelet transmit at full available power). This is to ensure that all signals are received at roughly the same strength by the relays.

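We proceed by computing the mutual informations $I(y_v; \hat{y}_v | \{h_{u,v}\})$ and $I(x_u; \hat{x}_u | \{h_{u,v}\})$ as required in Lemma 8 (the conditioning on $\{h_{u,v}\}$ being due to the availability of full CSI). Note first that by construction of $S^{(1,1)}(n_{\ell+1})$ (see (14)), we have for $u \in U(n_{\ell+1})$ and $v \in V(n_{\ell+1})$

$$ r_{u,A_1(a_{\ell+1})} \le r_{u,v} \le r_{u,A_1(a_{\ell+1})} + \sqrt{2a_{\ell+1}} \le 2 r_{u,A_1(a_{\ell+1})}, $$

and hence

$$ \tilde{r}_u\sqrt{2a_\ell} \le r_{u,v} \le 2\tilde{r}_u\sqrt{2a_\ell}. \tag{15} $$

From this, and since $|h_{u,v}|^{2} = r_{u,v}^{-\alpha}$, we obtain

$$ 2^{-3\alpha/2} a_\ell^{-\alpha/2} \le |h_{u,v}|^{2}\,\tilde{r}_u^{\alpha} \le 2^{-\alpha/2} a_\ell^{-\alpha/2}, $$
$$ 2^{-3\alpha/2} n_{\ell+1} a_\ell^{-\alpha/2} \le \|h_u\|^{2}\,\tilde{r}_u^{\alpha} \le 2^{-\alpha/2} n_{\ell+1} a_\ell^{-\alpha/2}. \tag{16} $$

We start by computing $I(y_v; \hat{y}_v | \{h_{u,v}\})$. We have

$$ \hat{y}_v = \sum_{u \in U(n_{\ell+1})} h_{u,v} x_u + z_v + \tilde{z}_v, $$

and hence $\hat{y}_v$ has mean zero and variance

$$ E(|\hat{y}_v|^{2}) = \sum_{u \in U(n_{\ell+1})} |h_{u,v}|^{2}\,\tilde{r}_u^{\alpha}\, n_{\ell+1}^{-1} a_\ell^{\alpha/2} + N_0 + \Delta^{2} $$
$$ \le 2^{-\alpha/2} + N_0 + \Delta^{2}, $$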
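where we have used (16). Hence

$$ I(y_v; \hat{y}_v | \{h_{u,v}\}) = h(\hat{y}_v | \{h_{u,v}\}) - h(\hat{y}_v | y_v, \{h_{u,v}\}) $$
$$ \le \log\big(2\pi e\, E(|\hat{y}_v|^{2})\big) - \log(2\pi e \Delta^{2}) $$
$$ \le \log\big(2\pi e (2^{-\alpha/2} + N_0 + \Delta^{2})\big) - \log(2\pi e \Delta^{2}) $$
$$ = \log\left(1 + \frac{2^{-\alpha/2} + N_0}{\Delta^{2}}\right). \tag{17} $$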



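We now compute $I(x_u; \hat{x}_u | \{h_{u,v}\})$. We have

$$ \hat{x}_u = \|h_u\| x_u + \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \frac{h_u^{\dagger} h_{\tilde{u}}}{\|h_u\|}\, x_{\tilde{u}} + \frac{h_u^{\dagger}}{\|h_u\|}\,(z + \tilde{z}). $$

Conditioned on $\{h_{\tilde{u}}\}_{\tilde{u} \in U(n_{\ell+1})}$,

$$ \|h_u\| x_u \sim \mathcal{N}_{\mathbb{C}}\big(0, \|h_u\|^{2}\,\tilde{r}_u^{\alpha}\, n_{\ell+1}^{-1} a_\ell^{\alpha/2}\big), $$

and

$$ E\left(\Big| \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \frac{h_u^{\dagger} h_{\tilde{u}}}{\|h_u\|}\, x_{\tilde{u}} + \frac{h_u^{\dagger}}{\|h_u\|}\,(z + \tilde{z}) \Big|^{2}\right) = \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \frac{|h_u^{\dagger} h_{\tilde{u}}|^{2}}{\|h_u\|^{2}}\,\tilde{r}_{\tilde{u}}^{\alpha}\, n_{\ell+1}^{-1} a_\ell^{\alpha/2} + \frac{h_u^{\dagger} E(z z^{\dagger}) h_u}{\|h_u\|^{2}} + \Delta^{2} $$
$$ \le \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \frac{|h_u^{\dagger} h_{\tilde{u}}|^{2}}{\|h_u\|^{2}}\,\tilde{r}_{\tilde{u}}^{\alpha}\, n_{\ell+1}^{-1} a_\ell^{\alpha/2} + N_0 + \Delta^{2}, $$

where we have used the assumption that $\{z_v\}_{v \in V(n_{\ell+1})}$ are uncorrelated in the second line. Using (16), this is, in turn, upper bounded by

$$ 2^{3\alpha/2}\,\tilde{r}_u^{\alpha}\, n_{\ell+1}^{-2} a_\ell^{\alpha} \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \tilde{r}_{\tilde{u}}^{\alpha}\, |h_u^{\dagger} h_{\tilde{u}}|^{2} + N_0 + \Delta^{2}. $$

Similarly, we can lower bound the received signal power as

$$ E\big(\big|\|h_u\| x_u\big|^{2}\big) \ge 2^{-3\alpha/2}. $$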

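Since Gaussian noise is the worst additive noise under a power constraint [15], and applying Jensen's inequality to the convex function $\log(1 + 1/x)$, we obtain

$$ I(x_u; \hat{x}_u | \{h_{u,v}\}) \ge E\left(\log\left(1 + \frac{2^{-3\alpha/2}}{2^{3\alpha/2}\,\tilde{r}_u^{\alpha}\, n_{\ell+1}^{-2} a_\ell^{\alpha} \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \tilde{r}_{\tilde{u}}^{\alpha}\, |h_u^{\dagger} h_{\tilde{u}}|^{2} + N_0 + \Delta^{2}}\right)\right) $$
$$ \ge \log\left(1 + \frac{2^{-3\alpha/2}}{2^{3\alpha/2}\,\tilde{r}_u^{\alpha}\, n_{\ell+1}^{-2} a_\ell^{\alpha} \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \tilde{r}_{\tilde{u}}^{\alpha}\, E\big(|h_u^{\dagger} h_{\tilde{u}}|^{2}\big) + N_0 + \Delta^{2}}\right). \tag{18} $$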

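We have for $\tilde{u} \ne u$,

$$ E\big(|h_u^{\dagger} h_{\tilde{u}}|^{2}\big) = \sum_{v, \tilde{v} \in V(n_{\ell+1})} E\big(h^{*}_{u,v} h_{\tilde{u},v} h_{u,\tilde{v}} h^{*}_{\tilde{u},\tilde{v}}\big) = \sum_{v \in V(n_{\ell+1})} |h_{u,v}|^{2} |h_{\tilde{u},v}|^{2}, $$

and hence using (15)

$$ \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \tilde{r}_u^{\alpha}\,\tilde{r}_{\tilde{u}}^{\alpha}\, E\big(|h_u^{\dagger} h_{\tilde{u}}|^{2}\big) = \sum_{\tilde{u} \in U(n_{\ell+1}) \setminus \{u\}} \sum_{v \in V(n_{\ell+1})} \tilde{r}_u^{\alpha} |h_{u,v}|^{2}\,\tilde{r}_{\tilde{u}}^{\alpha} |h_{\tilde{u},v}|^{2} \le 2^{-\alpha} n_{\ell+1}^{2} a_\ell^{-\alpha}. \tag{19} $$

Therefore we can continue (18) as

$$ I(x_u; \hat{x}_u | \{h_{u,v}\}) \ge \log\left(1 + \frac{2^{-3\alpha/2}}{2^{\alpha/2} + N_0 + \Delta^{2}}\right) \triangleq K_4. \tag{20} $$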



Using (17) and (20) in Lemma 8, and observing that we only communicate during a fraction

$$P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2} \le 1$$
of time yields a per source node rate p^^^(ra) arbitrarily close to 

and a quantizer of rate arbitrarily close to 

log 1 + 



A2 

bits per observation at each relay node. Since by (20) the mutual information $I(x_u; \hat{x}_u \mid \{h_{u,v}\})$ is at least $K_4$
for every $u \in U(n_{\ell+1})$ during the fraction of time we actually communicate, this implies that there are
at most $1/K_4$ observations at each relay node per $n_{\ell+1}$ total message bits. Thus the number of bits per
relay node required to quantize the observations is at most

^^ = ^^"H^+ A^ 
bits per $n_{\ell+1}$ total message bits sent by the source nodes. $\square$

C. Broadcast Phase 

At the end of the MAC phase, each node in the relay squarelet received a part of the message sent 
by each source node. In the BC phase, each node in the relay squarelet encodes these messages together 
for $n_{\ell+1}$ transmit antennas. The encoded message is then quantized and communicated to all the nodes
in the relay squarelet. These nodes then send the quantized encoded message to the destination nodes
$W(n_{\ell+1})$. Note that this again induces a uniform traffic pattern between the nodes in the relay squarelet,
i.e., every node needs to transmit quantized encoded messages to every other node. While this traffic
pattern does not correspond to a permutation traffic matrix, it can be written as a sum of $n_{\ell+1}$ permutation
traffic matrices. A fraction $1/n_{\ell+1}$ of the traffic within the relay squarelet is transmitted according to each
of these permutation traffic matrices. This setup is depicted in Figure 3 in Section IV-A.
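
The decomposition of uniform traffic into permutation traffic matrices can be illustrated with a minimal sketch. It assumes cyclic shifts as the permutations, which is only one possible choice and is not necessarily the one used in the paper; the function names are hypothetical. The shift by zero corresponds to a node keeping its own part, which requires no transmission.

```python
def cyclic_permutation_matrices(n):
    """Return n permutation matrices P_0..P_{n-1}, where P_k[i][j] = 1 iff j == (i + k) % n.

    Assumption: cyclic shifts are used purely for illustration.
    """
    return [
        [[1 if j == (i + k) % n else 0 for j in range(n)] for i in range(n)]
        for k in range(n)
    ]


def matrix_sum(matrices):
    """Entrywise sum of a list of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


# Uniform traffic among 4 relay nodes: every node has traffic for every node.
perms = cyclic_permutation_matrices(4)
uniform = matrix_sum(perms)  # all-ones 4 x 4 matrix
```

Serving each of the $n_{\ell+1}$ permutation matrices for an equal share of the time then delivers the uniform traffic pattern.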

Assuming for the moment that we have a scheme to send the quantized encoded messages to the 
corresponding nodes in the relay squarelet, the traffic matrix S^^'^\ni+i) between V{ni+i) and W{ni+i) 
describes then a BC with one transmitter with $n_{\ell+1}$ antennas and $n_{\ell+1}$ receivers, each with one antenna.
We call this the BC induced by S^^'^\ni^i) in the following. 

Lemma 10. For the BC induced by S'^^''^^ (^^£+1) with per-node average power constraint Pe{n) < nj^^a^' , 
a rate of 

is achievable per destination node, and the number of bits required to quantize the observations is at
most $K_7(\ell+1)\log(n)$ bits at each relay node per $n_{\ell+1}$ total message bits* received by the destination
nodes.

Proof. Consider a node $v \in V(n_{\ell+1})$ in the relay squarelet, say the first one. From the MAC phase,
this node received the first part of the messages of each source node $u \in U(n_{\ell+1})$. We would like to
jointly encode these message parts at the relay node using transmit beamforming, and then transmit the 

*Total message bits refers to the sum of all message bits received by the $n_{\ell+1}$ destination nodes.
corresponding encoded signal using all the nodes in the relay squarelet. However, this cannot be done
directly, because at the encoding time, the future channel state at transmission time is unknown. 

We circumvent this problem by reordering the signals to be transmitted at the relay nodes as follows. 
Let

$$\{\tilde\theta_{v,w}\}_{v \in V(n_{\ell+1}),\, w \in W(n_{\ell+1})} \in \{0, \pi/2, \pi, 3\pi/2\}^{n_{\ell+1}^2}$$

be a "quantized" channel state. The part of the messages at node one in the relay squarelet is encoded
for $n_{\ell+1}$ transmit nodes with an assumed channel gain of

$$\tilde h_{v,w}[t] = r_{v,w}^{-\alpha/2} \exp\big(\sqrt{-1}\,\tilde\theta_{v,w}[t]\big),$$

where the $\{\tilde\theta_{v,w}[t]\}_{v,w,t}$ are cycled as a function of $t$ through all possible values in $\{0, \pi/2, \pi, 3\pi/2\}^{n_{\ell+1}^2}$.
The components of the encoded messages are then quantized and each component sent to the corresponding 
node in the relay squarelet. Once all nodes in the relay squarelet have received the encoded message, 
they send in each time slot a sample of the encoded messages corresponding to the quantized channel 
state closest (in Euclidean distance) to the actual channel realization in that time slot. By ergodicity of
$\{\theta_{v,w}[t]\}_t$, each quantized channel state is used approximately the same number of times. More precisely,
as the message length grows to infinity, we can send samples of the encoded message parts a $1/(1+\eta)$
fraction of time with probability approaching 1 for any $\eta > 0$. Since we have no constraint on the encoding
delay in our setup, we can choose $\eta$ arbitrarily small, and given that we are only interested in scaling laws,
we will ignore this term in the following to simplify notation. Note that the destination nodes can reorder
the received samples since we assume full CSI. In the following, we let $\{\tilde\theta_{v,w}\}_{v,w}$ be the random quantized
channel state induced by $\{\theta_{v,w}\}_{v,w}$ through the above procedure. Denote by $\{\tilde h_{v,w}\}_{v,w}$ the corresponding
channel gains.
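
The effect of quantizing each phase to the nearest of four values can be checked numerically. The following is a minimal sketch, not the paper's encoder: it assumes per-link phases are quantized independently (equivalent to the nearest state in Euclidean distance when the magnitudes are fixed), and the function names are hypothetical. Since every quantized phase is within $\pi/4$ of the true one, beamforming with the quantized gains loses at most a factor $\cos^2(\pi/4)$ in received power, which is the origin of the $\cos(\pi/4)$ factor used later in the proof.

```python
import cmath
import math

QUANT_PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]


def quantize_phase(theta):
    """Return the phase in QUANT_PHASES closest to theta on the circle."""
    def circ_dist(a, b):
        d = abs(a - b) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    return min(QUANT_PHASES, key=lambda q: circ_dist(theta, q))


def beamforming_gain(amplitudes, phases):
    """Return (|h_q^dagger h|^2 / ||h||^2, ||h||^2) for beamforming with quantized phases.

    h holds the true gains; h_q has the same magnitudes but quantized phases.
    """
    h = [a * cmath.exp(1j * t) for a, t in zip(amplitudes, phases)]
    h_q = [a * cmath.exp(1j * quantize_phase(t)) for a, t in zip(amplitudes, phases)]
    norm_sq = sum(abs(x) ** 2 for x in h)
    inner = sum(hq.conjugate() * x for hq, x in zip(h_q, h))
    return abs(inner) ** 2 / norm_sq, norm_sq
```

For any magnitudes and phases, the returned gain lies between $\cos^2(\pi/4)\|h\|^2$ and $\|h\|^2$.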

As in the MAC phase, the nodes in the relay squarelet send signals at a power (essentially) $n_{\ell+1}^{-1} a_\ell^{\alpha/2}$
a fraction $P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2} \le 1$ of time and are silent for the remaining time. To create interference at
uniform power, this is done in the same randomized manner as in the MAC phase. Generate independently
for each region $A(a_\ell)$ a Bernoulli process $\{B[t]\}_{t \in \mathbb{N}}$ with parameter $P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2}/(1+\eta)$ for some
small $\eta > 0$. The nodes in $A(a_\ell)$ are active whenever $B[t] = 1$ and remain silent otherwise. As before, we
ignore the additional $1/(1+\eta)$ term. Again we only need to consider the fraction of time during which
$B[t] = 1$.

Consider the message part at a relay node for destination node $w \in W(n_{\ell+1})$. We encode this part
independently; call $x_w$ the encoded message part. The relay node then performs transmit beamforming to
construct the encoded message for all its destination nodes, i.e.,

III II •^'"J' 

where row vector h^, = {hv,w}vev(ni+i) contains the channel gains to node w, and where we have 
used \h^^w\ = |/ii,,w|. The relay node then quantizes the vector of encoded messages componentwise and 
forwards the quantized version x to the other nodes in the relay squarelet. These nodes then send x over 
the channel to the destination nodes. The received signal at destination node w is thus 

With this, we have the setup considered in Lemma 8 (with different variable names). The first "channel"
in Lemma 8 (see Figure 5) will correspond to the transmit beamforming used at the relay squarelet. The
second channel in Lemma 8 will now correspond to the wireless channel between the relay squarelet
$V(n_{\ell+1})$ and a destination node $w$. To apply Lemma 8, we need to find a distribution for $x_w$ and for
$\hat{x}_v \mid \tilde{x}_v$ (with $\tilde{x}_v$ the component of the beamformed signal for relay node $v$ and $\hat{x}_v$ its quantized version). We also need to guarantee that $\hat{x}_v$ satisfies the power constraint at each node $v$ in the relay squarelet.
For each $w \in W(n_{\ell+1})$ let $x_w \sim \mathcal{N}_{\mathbb{C}}(0, K n_{\ell+1}^{-1} a_\ell^{\alpha/2})$ (for some $K$ to be chosen later) independent of $x_{\tilde w}$
for $\tilde w \neq w$, and let $\hat{x}_v = \tilde{x}_v + \hat{z}_v$ for $\hat{z}_v \sim \mathcal{N}_{\mathbb{C}}(0, \Delta^2)$ independent of $\tilde{x}_v$ and for some $\Delta^2 > 0$. We then
have

Vw II I II ■^UI T / II 1 II •'^w T flw^ ~r 2^1«- 

n 1^ ^ n - 

We proceed by computing the mutual informations $I(\tilde{x}_v; \hat{x}_v \mid \{h_{u,v}\})$ and $I(x_w; y_w \mid \{h_{u,v}\})$ as required
in Lemma 8 (the conditioning on $\{h_{u,v}\}$ again being due to the availability of full CSI). Note first that by
construction of S'*^^'^)(n^+i), we have for any w G W{ng^i) 

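$$2 \min_{v \in V(n_{\ell+1})} r_{v,w} \ge \max_{v \in V(n_{\ell+1})} r_{v,w},$$

and therefore

$$\frac{|h_{v,w}|^2}{\|h_w\|^2} \le \frac{\big(\min_{v \in V(n_{\ell+1})} r_{v,w}\big)^{-\alpha}}{n_{\ell+1}\big(\max_{v \in V(n_{\ell+1})} r_{v,w}\big)^{-\alpha}} \le \frac{2^{\alpha}}{n_{\ell+1}}. \qquad (21)$$

We start by computing $I(\tilde{x}_v; \hat{x}_v \mid \{h_{u,v}\})$. $\hat{x}_v$ has mean zero and variance

$$\begin{aligned}
\mathbb{E}\big(|\hat{x}_v|^2\big) &= \sum_{w \in W(n_{\ell+1})} \frac{|h_{v,w}|^2}{\|h_w\|^2} K n_{\ell+1}^{-1} a_\ell^{\alpha/2} + \Delta^2 \\
&\le n_{\ell+1} \frac{2^{\alpha}}{n_{\ell+1}} K n_{\ell+1}^{-1} a_\ell^{\alpha/2} + \Delta^2 \\
&\le n_{\ell+1}^{-1} a_\ell^{\alpha/2}, \qquad (22)
\end{aligned}$$

for

$$K \triangleq 2^{-\alpha}(1 - \Delta^2),$$

which is positive for $\Delta^2 < 1$, and where we have used (21) and that

$$n_{\ell+1}^{-1} a_\ell^{\alpha/2} \ge 1$$

by (12). Equation (22) shows that $\hat{x}_v$ satisfies the power constraint of node $v$ in the relay squarelet $V(n_{\ell+1})$.
Moreover, we obtain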

< log ('27reE(|£^|2)') - log(27reA2) 

+r 

A' 

It remains to compute $I(x_w; y_w \mid \{h_{u,v}\})$. Note that the encoding procedure guarantees that

cos(7r/4)2||/i^f < \Kh)S < WKt- 
Moreover, for $\tilde w \neq w$,

/ J '^yi'T'vwi I '''?;«) I ) 

veV{ni+i) 



^e+i^e 



<log(^±^^). (23) 
From this, we get by a similar argument as in Lemma 9 that

$$I(x_w; y_w \mid \{h_{u,v}\}) \ge K_6. \qquad (24)$$

Using (23) and (24) in Lemma 8, and observing that we only communicate during a fraction

of time, yields a per destination node rate p^'^{n) arbitrarily close to 

$$K_6 P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2}$$

bits per channel use and a quantizer rate arbitrarily close to

-1 a/2 

bits per encoded sample. Since by (24) the mutual information $I(x_w; y_w \mid \{h_{u,v}\})$ is at least $K_6$ for every
$w \in W(n_{\ell+1})$ during the fraction of time we actually communicate, this implies that there are at most
$1/K_6$ encoded message samples for each relay node per $n_{\ell+1}$ total message bits received by the destination
nodes $W(n_{\ell+1})$. Thus the number of bits required at each relay node to quantize the encoded message
samples is at most



1 -1 "/2 



<i[iog(i,2-v/^; 



<i^7(^+l)log(n) 

bits per $n_{\ell+1}$ total message bits received by the destination nodes, and where we have used $\gamma(n) \le n$ by
(11). $\square$



VII. Proof of Theorem 1

The proof of Theorem 1 is split into two parts. In Section VII-A, we prove the theorem for fast fading,
and in Section VII-B for slow fading.

A. Fast Fading 

In this section, we prove Theorem 1 under fast fading, i.e., $\{\theta_{u,v}[t]\}_t$ is stationary and ergodic in $t$.
We first prove that the assumptions on the power constraint and the interference made in Section VI (see
Lemmas 9 and 10) during the analysis of one level of the hierarchical relaying scheme are valid. We then
use the results proved there to analyze the behavior of the entire hierarchy, yielding a lower bound on the
per-node rate achievable with hierarchical relaying.

We first argue that the constraint $P_\ell(n) \le n_{\ell+1}^{-1} a_\ell^{\alpha/2}$ needed in Lemmas 9 and 10 is satisfied. Consider
the hierarchical relaying scheme as described in Section IV and fix a level $\ell$, $0 \le \ell < L = L(n)$, in this
hierarchy. At level $\ell$ we have a square of area $a_\ell = n/\gamma^{\ell}(n)$, with $n_\ell = n/(2^{\ell}\gamma^{\ell}(n))$ source-destination
pairs. Since we are time sharing between $K_2 2^{-\ell}\gamma(n)$ relay squarelets at this level, we have an average
power constraint of

$$P_\ell(n) = K_2 2^{-\ell}\gamma(n)$$

during the time any particular relay squarelet is active. Since $\alpha > 2$ and since $n\gamma^{-L(n)}(n) \to \infty$ as
$n \to \infty$, we have, for $n$ large enough (independent of $\ell$), that

$$P_\ell(n) = K_2 2^{-\ell}\gamma(n) \le 2^{\ell+1}\gamma(n)\Big(\frac{n}{\gamma^{\ell}(n)}\Big)^{\alpha/2-1} = n_{\ell+1}^{-1} a_\ell^{\alpha/2}.$$

Therefore the power constraint in Lemmas 9 and 10 is satisfied.

We continue by analyzing the interference caused by spatial re-use. Recall that the MAC and BC phases
at level $\ell$ induce permutation traffic within the dense squarelets at level $\ell+1$. The permutation traffic
within those dense squarelets at level $\ell+1$ is transmitted in parallel with spatial re-use. We now describe
in detail how this spatial re-use is performed. Partition the squarelets of area $a_{\ell+1}$ (i.e., at level $\ell+1$)
into four subsets such that in each subset all squarelets are at distance at least $\sqrt{a_{\ell+1}}$ from each other.
The traffic that the MAC and BC phases at level $\ell$ induce in each of the relay squarelets at level $\ell+1$ is
transmitted simultaneously within all relay squarelets in the same subset. Consider now one such subset.
We show that at any relay squarelet the interference from other relay squarelets in the same subset is
stationary and ergodic within each phase, additive (i.e., independent of the signals and channel gains in
this relay squarelet), and of bounded power $N_0 - 1$ independent of $n$.
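
One way to obtain such a partition is a checkerboard-style assignment by the parity of the row and column index of each squarelet. The following is a minimal sketch under that assumption (the paper does not spell out the assignment in this passage, and the function names are hypothetical); it also computes the distance between two squarelets so the separation can be verified.

```python
def subset_index(i, j):
    """Assign the squarelet in grid row i, column j to one of four subsets (0..3)."""
    return 2 * (i % 2) + (j % 2)


def squarelet_gap(sq_a, sq_b, side):
    """Euclidean distance between two closed axis-aligned squarelets of the given side.

    Squarelets are identified by their (row, column) index in the grid.
    """
    (ia, ja), (ib, jb) = sq_a, sq_b
    gap_rows = max(abs(ia - ib) - 1, 0) * side
    gap_cols = max(abs(ja - jb) - 1, 0) * side
    return (gap_rows ** 2 + gap_cols ** 2) ** 0.5
```

Two distinct squarelets in the same subset differ by an even number of rows and columns, at least one of which is nonzero, so they are separated by at least one side length $\sqrt{a_{\ell+1}}$.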

We first argue that the interference is stationary and ergodic within each phase. Note first that on
any level $\ell+1$ in the hierarchy, all relay squarelets are either simultaneously in the MAC phase or
simultaneously in the BC phase. Furthermore, all relay squarelets are also synchronized for transmissions
within each of these phases (recall that the induced traffic in level $\ell+1$ is uniform and is sent sequentially
as permutation traffic). Hence it suffices to show that the interference generated by either the MAC or
BC induced by some permutation traffic matrix is stationary and ergodic. Since all codebooks for either
of these cases are generated as i.i.d. Gaussian multiplied by a Bernoulli process, and in the BC phase
beamformed for stationary and ergodic fading, this is indeed the case.

The additivity of the interference follows easily for the MAC phase, since codebooks are generated 
independently of the channel realization in this case. Moreover, since the channel gains are independent 
from each other and all codebooks are generated as independent zero mean processes, the interference 
in the MAC phase is also uncorrelated (over space) within each relay squarelet. For the BC phase, 
the codebook depends only on the channel gains within each relay squarelet at level $\ell+1$. Since the
channel gains within relay squarelets are independent of the channel gains between relay squarelets, this 
interference is additive as well. 

We now bound the interference power. Note that by the randomized time-sharing construction within
the MAC and BC phases (see Lemmas 9 and 10), in each relay squarelet, at most $n_{\ell+1}$ nodes transmit at
an average power of 1. In the MAC phase, all nodes use independently generated codebooks with power
at most 1, and thus the received interference power from another relay squarelet at distance $i\sqrt{a_{\ell+1}}$ is at
most

Ui+ii a^+i -I 2' \Y+\n)) - ' 

by (12). In the BC phase, the nodes in each active relay squarelet use beamforming to transmit to nodes
within their own squarelet. Since the channel gains within a relay squarelet are independent of the channel
gains between relay squarelets, the same calculation as in (19) shows that we can upper bound the received
interference power from another relay squarelet at distance $i\sqrt{a_{\ell+1}}$ by

■~a —a/2 ^ -—a 

in the BC phase as well. 
Now, by the way in which we perform spatial re-use, every active relay squarelet has at most $8i$ active
relay squarelets at distance at least $i\sqrt{a_{\ell+1}}$. Hence the total interference power received at an active relay
squarelet is at most

$$\sum_{i=1}^{\infty} 8i\, 2^{\alpha} i^{-\alpha} \triangleq N_0 - 1 < \infty$$

since $\alpha > 2$. With this, we have shown that the interference term has the properties required for Lemmas 9
and 10 to apply.
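
The convergence of the interference sum for $\alpha > 2$ can be checked numerically. The sketch below assumes that each of the at most $8i$ squarelets at distance index $i$ contributes interference power at most $2^{\alpha} i^{-\alpha}$, so that the terms decay like $i^{1-\alpha}$; the function names are hypothetical. The omitted tail is bounded by comparison with an integral.

```python
def interference_partial_sum(alpha, terms):
    """Partial sum of sum_{i >= 1} 8 * i * 2**alpha * i**(-alpha) over the first `terms` terms."""
    return sum(8 * i * 2 ** alpha * i ** (-alpha) for i in range(1, terms + 1))


def tail_bound(alpha, terms):
    """Upper bound on the omitted tail sum_{i > terms}, valid only for alpha > 2.

    Uses sum_{i > N} i**(1 - alpha) <= integral_N^inf x**(1 - alpha) dx = N**(2 - alpha) / (alpha - 2).
    """
    if alpha <= 2:
        raise ValueError("the sum diverges for alpha <= 2")
    return 8 * 2 ** alpha * terms ** (2 - alpha) / (alpha - 2)
```

For example, with $\alpha = 3$ the sum equals $64 \sum_i i^{-2} = 64\pi^2/6$, and `interference_partial_sum(3.0, N) + tail_bound(3.0, N)` brackets this value from above for every $N$.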

We now apply those two lemmas to obtain a lower bound on the rate achievable with hierarchical
relaying. Call $\tau_\ell(n)$ the number of channel uses to transmit one bit from each of $n_\ell$ source nodes to the
corresponding destination nodes at level $\ell$. Lemma 7 states that for $n$ large enough (independent of $\ell$),
we relay over each dense squarelet at most $K_3 2^{\ell}$ times. Combining this with Lemma 9, we see that to
transmit one bit from each source to its destination at this level we need at most

$$4 K_3 2^{\ell}\, K_2 2^{-\ell}\gamma(n)\, \frac{1}{K_4 P_\ell(n)\, n_{\ell+1}\, a_\ell^{-\alpha/2}} = \frac{4K_3}{K_4} 2^{\ell} n_{\ell+1}^{-1} a_\ell^{\alpha/2} = \frac{K_3}{K_4} 2^{2\ell+3} n^{\alpha/2-1} \gamma^{1+\ell(1-\alpha/2)}(n)$$

channel uses for the MAC phase. Here, the factor 4 accounts for the spatial re-use, $K_3 2^{\ell}$ accounts for
relaying over the same relay squarelets multiple times, $K_2 2^{-\ell}\gamma(n)$ accounts for time sharing between the
relay squarelets, and the last term accounts for the time required to communicate over the MAC. Similarly,
combining Lemmas 7 and 10, we need at most

$$\frac{K_3}{K_6} 2^{2\ell+3} n^{\alpha/2-1} \gamma^{1+\ell(1-\alpha/2)}(n)$$

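channel uses for the BC phase. Moreover, at level $\ell+1$ in the hierarchy this induces a per-node traffic
demand of at most $K_5$ bits from the MAC phase, and at most $K_7(\ell+1)\log(n)$ from the BC phase. Thus
we obtain the following recursion

$$\begin{aligned}
\tau_\ell(n) &\le 8K_3\Big(\frac{1}{K_4} + \frac{1}{K_6}\Big) n^{\alpha/2-1}\gamma(n)\big(4\gamma^{1-\alpha/2}(n)\big)^{\ell} + \big(K_5 + K_7(\ell+1)\log(n)\big)\tau_{\ell+1}(n) \\
&\le K n^{\alpha/2-1}\gamma(n) 4^{\ell} + \tilde{K}(\ell+1)\log(n)\,\tau_{\ell+1}(n) \\
&\le K n^{\alpha/2-1}\gamma(n) 4^{\ell} + \tilde{K} L \log(n)\,\tau_{\ell+1}(n) \qquad (25)
\end{aligned}$$

for positive constants $K$, $\tilde{K}$ independent of $n$ and $\ell$.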

We use TDMA at scale $a_L$ with $n_L$ nodes and source-destination pairs. Time sharing between all source-destination
pairs, we have (during the time we communicate for each node) an average power constraint
of $n_L$. Since at this level we communicate over a distance of at most $2a_L^{1/2}$, we have

rdn) < UL log^' f 1 + ''\,, ) . (26) 

Since 

^L%"^^ < nial^ = 2-^ ^0 

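as $n \to \infty$, we can upper bound (26) as

$$\begin{aligned}
\tau_L(n) &\le K' a_L^{\alpha/2} \\
&= K' n^{\alpha/2}\gamma^{-L\alpha/2}(n) \\
&\le K' n^{\alpha/2}\gamma^{-L}(n) \qquad (27)
\end{aligned}$$

for some constant $K'$.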
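Now, using the recursion (25) $L$ times, and combining with (27), we obtain

$$\begin{aligned}
\tau_0(n) &\le K n^{\alpha/2-1}\gamma(n) 4^{L} + \tilde{K} L \log(n)\,\tau_1(n) \\
&\le \ldots \\
&\quad + \big(\tilde{K} L \log(n)\big)^{L}\tau_L(n) \\
&\le n^{\alpha/2-1}\big(\tilde{K} L \log(n)\big)^{L}\big(K 4^{L}\gamma(n) + K' n\gamma^{-L}(n)\big). \qquad (28)
\end{aligned}$$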
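
The unrolling of the recursion can be made concrete with a small sketch. It treats the recursion with equality and abstracts the constants: `a` stands for a per-level cost coefficient such as $K n^{\alpha/2-1}\gamma(n)$, `b` for the multiplier $\tilde{K} L \log(n)$, and `tau_top` for the TDMA cost $\tau_L(n)$ at the last level. These names and the use of $4^{\ell}$ per level (rather than the cruder $4^{L}$) are assumptions made for illustration.

```python
def unroll_recursion(a, b, tau_top, levels):
    """Evaluate tau_0 from tau_l = a * 4**l + b * tau_{l+1}, starting at tau_levels = tau_top."""
    tau = tau_top
    for level in range(levels - 1, -1, -1):
        tau = a * 4 ** level + b * tau
    return tau


def closed_form(a, b, tau_top, levels):
    """Closed form of the unrolled recursion: sum_l a * (4 b)**l + b**levels * tau_top."""
    return sum(a * (4 * b) ** level for level in range(levels)) + b ** levels * tau_top
```

Whenever $b \ge 1$, the closed form is at most `b**levels * (a * 4**levels + tau_top)`, which has the same shape as the final bound in (28).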

Using the definition of $\gamma(n)$ and $L = L(n)$ in (11), we have for $n$ large enough

(i^L(n)log(n))''^"^ < ^2iog-V2-*(„)iogiogH^ 

4^(")7(ra) < ^21og-i/2-«(n)+log*-i/2(n)^ 

Since 5 > 0, the ra^°^ ("^ term dominates in (|28l) , and we obtain 

ro(n) < 6(n)n"/2"\ 



where 

as n — > CX3. Therefore 

with 



6(n) < n^(i°s^-^/^W), 
p*(n) > p^"(n) = l/ro(n) > 6(n)n^-°/2, 



concluding the proof for the fast fading case. 

B. Slow Fading 

In this section, we prove Theorem 1 under slow fading, i.e., $\{\theta_{u,v}[t]\}_t$ is constant as a function of $t$.
We sketch the necessary modifications for the scheme described in Section IV to achieve a per-node rate
of at least $b(n)n^{1-\alpha/2}$ in the slow fading case.

Consider level $\ell$, $0 \le \ell < L(n)$, in the hierarchy. Instead of relaying the message of a source-destination
pair over one relay squarelet as in the scheme described in Section IV, we relay the message over many
dense squarelets that are at least at distance $\sqrt{2a_{\ell+1}}$ from both the source and the destination nodes. We
time share between the different relays. The idea here is that the wireless channel between any node
and its relay squarelet might be in a bad state due to the slow fading, making communication over this
relay squarelet impossible. Averaged over many relay squarelets, however, we get essentially the same
performance as in the fast fading case.

We first state a (somewhat weaker) version of Lemma 7, appropriate for this setup. Consider again
the collection of schedules S{n() and S{n() satisfying the conditions that no relay squarelet is selected 
by more than $n_{\ell+1}$ source-destination pairs and that all sources and destinations are at least at distance
$\sqrt{2a_{\ell+1}}$ from their relay squarelet (see Section VI-A for the formal definition). The next lemma shows
that for each source-destination pair, we can find $K_2 2^{-\ell-1}\gamma(n)$ distinct relay squarelets satisfying the
above conditions (the requirement that these relay squarelets are distinct is expressed by the orthogonality
condition of the schedules in Lemma 11 below).
Lemma 11. For every $n$ large enough (independent of $\ell$) and every permutation traffic matrix $\lambda(n_\ell) \in$
|Q^^|n,xn, ^^^j.^ ^^g schedules {5(*)(n^)},^f "'^'^"^ C 5(n^), {S^'^ne)}^^^ '^'^""^ C S{ne) satisfying 

where {S^'^\ni)}i, {S^^\ni)}i are collections of orthogonal matrices in the sense that for i ^ i', 



E 



s^'^ s^'^ - 



u,k ^29) 



^k,u^k,u ~ ^■ 



k,u 

Proof The proof is similar to that of Lemma |71 In order to construct {S^'^^rii)} and {S^'^^rii)}, consider 
the sequential pass over all $n_\ell$ source-destination pairs (assume $n$ is large enough for Lemma 7 to hold). As
before, for each source-destination pair, there are $K_2 2^{-\ell-1}\gamma(n)$ dense relay squarelets that are at distance
at least $\sqrt{2a_{\ell+1}}$. Each pair chooses all of these $K_2 2^{-\ell-1}\gamma(n)$ squarelets, instead of just one as before.
Stop one round of this procedure as soon as any of the relay squarelets is chosen by $n_{\ell+1}$ pairs. Since by
the end of one round at least one relay squarelet is matched by $n_{\ell+1}$ source-destination pairs, there are at
most $n_\ell/n_{\ell+1} = 2\gamma(n)$ such rounds.

Consider now the result of one such round. We construct K22~^~^^{n) matrices S^^^nc) and S^^\ne), 
with the $i$-th pair of matrices describing communication over the $i$-th relay squarelets chosen by source-destination
pairs matched in this round. Thus, this process produces a total of $2\gamma(n) K_2 2^{-\ell-1}\gamma(n) =
K_2 2^{-\ell}\gamma^2(n)$ such matrices. The orthogonality property follows since each source-destination pair relays
over the same relay squarelet only once. $\square$

Given a decomposition of the scaled traffic matrix $K_2 2^{-\ell-1}\gamma(n)\lambda(n_\ell)$ into $K_2 2^{-\ell}\gamma^2(n)$ matrices, each
source-destination pair tries to relay over $K_2 2^{-\ell-1}\gamma(n)$ dense squarelets. We time share between these
relay squarelets. Since each source-destination pair relays only a $(K_2 2^{-\ell-1}\gamma(n))^{-1}$ fraction of traffic over
any of its relay squarelets, the loss due to this time sharing is now

$$\frac{K_2 2^{-\ell}\gamma^2(n)}{K_2 2^{-\ell-1}\gamma(n)} = 2\gamma(n)$$

as opposed to $K_3 2^{\ell}$ in Lemma 7. In other words, the loss is at most a factor $2\gamma(n)$ more than in Lemma 7.
Using the definition of $\gamma(n)$ in (11), we have

7(r7) <n-'°s'"'^'(") <b^\n). 

In other words, this additional loss is small. 

Consider now a specific relay squarelet. If a source-destination pair can communicate over this relay
squarelet at a rate at least 1/64-th of the rate achievable in the fast fading case (given by Lemmas 9
and 10), it sends information over this relay. Otherwise it does not send anything during the period of
time it is assigned this relay. We now show that, with probability $1 - o(1)$ as $n \to \infty$, for every source-destination
pair on every level of the hierarchy at least one quarter of its relay squarelets can support this
rate. As we only communicate over a quarter of the relay squarelets, this implies that we can achieve at
least 1/256-th of the per-node rate for the fast fading case (see Section VII-A), i.e., that $b(n)n^{1-\alpha/2}$ is
achievable with probability $1 - o(1)$ as $n \to \infty$.

Assume we have for each source-destination pair $(u,w)$ picked $K_2 2^{-\ell-1}\gamma(n)$ dense squarelets over
which it can relay; call those relay squarelets $\{A_{u,w,k}\}_{k=1}^{K_2 2^{-\ell-1}\gamma(n)}$. Consider the event $B_{u,w,k}$ that source
node $u$ can communicate at the desired rate to destination node $w$ over relay squarelet $A_{u,w,k}$ (assuming,
as before, that we can solve the communication problem within this squarelet).
Let $\{B^{(i)}_{u,w,k}\}_{i=1}^{4}$ be the events that the interference due to matched filtering in the MAC phase, the
interference from spatial re-use in the MAC phase, the interference due to beamforming in the BC phase,
and the interference from spatial re-use in the BC phase, are less than 8 times the one for fast fading,
respectively. From the proof of Lemmas 9, 10, and of Theorem 1 for the fast fading case in Section VII-A,
we see that

$$\bigcap_{i=1}^{4} B^{(i)}_{u,w,k} \subset B_{u,w,k}.$$



Due to spatial re-use, multiple relay squarelets will be active in parallel. Let $H$ denote the set of channel
gains between active relay squarelets. Using essentially the same arguments as for the fast fading case
(see Lemmas 9, 10, and Section VII-A) and from Markov's inequality, we have $\mathbb{P}(B^{(i)}_{u,w,k} \mid H) \ge 7/8$ for
all $i \in \{1, \ldots, 4\}$ and hence $\mathbb{P}(B_{u,w,k} \mid H) \ge 1/2$.
We now argue that the events

$$\Big\{\bigcap_{i=1}^{4} B^{(i)}_{u,w,k}\Big\}_{k=1}^{K_2 2^{-\ell-1}\gamma(n)} \qquad (30)$$



are independent conditioned on $H$, by showing that these events depend on disjoint sets of channel gains
and codebooks. Assuming the codebooks are generated anew for each communication round, they are
all independent. Thus we only have to consider the dependence on the channel gains. Let $U_k$ and $W_k$ be
the source and destination nodes communicating over relay squarelet $A_{u,w,k}$ in round $k$, and let $V_k$ be the
nodes in $A_{u,w,k}$. Let $\tilde{U}_k$, $\tilde{W}_k$ be the source and destination nodes that are communicating at the same time
as {u,w) due to spatial re-use. Let Vk be the relay nodes of Uk and Wk- Now, B^ ^ f, and B^ ^ ^, depend 
(for fixed H) on the channel gains between Uk and Vk- i?„^ ;. depends on the channel gains between Vk 
and Wk- B^'^ f, depends (again for fixed H) on the channel gains between Vk and Wk- Since these sets 



are disjoint for different $k$ by the orthogonality of the schedules (see (29)), conditional independence of
the events in (30) follows.

To summarize, conditioned on the channel gains $H$ between active relay squarelets, the random variables
$\{\mathbb{1}_{B_{u,w,k}}\}_k$ are independent and have expected value $\mathbb{E}(\mathbb{1}_{B_{u,w,k}} \mid H) \ge 1/2$. The sum

$$\sum_{k=1}^{K_2 2^{-\ell-1}\gamma(n)} \mathbb{1}_{B_{u,w,k}}$$

is the number of relay squarelets over which the source-destination pair $(u, w)$ successfully relays traffic.
We now show that with high probability at least one quarter of these relay squarelets allow successful
transmission. Applying the Chernoff bound yields that

p(E.1b„,.,. < /^22-^"^(n)|#) < p(E.1b_,, < i^22-^-^(n)P(i?„,^,,|#)|#) 

< exp ( - 2K2-%(n)P(S„,^,fc|^)) 

< exp ( - A'2-^7(n)) 

for some constant Ji > 0. Since the right-hand side is the same for all H, this implies 

p(E.1b„,..,, < i^22"^-^7(r^)) < exp ( - KT'^{n)). 
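
The concentration step can be illustrated numerically. The sketch below is not the paper's Chernoff bound with its specific constants; it assumes the worst case in which each of $m$ relay squarelets succeeds independently with probability exactly $1/2$, computes the exact probability that at most a quarter succeed, and compares it with Hoeffding's inequality, which is used here as a stand-in exponential bound. The function names are hypothetical.

```python
import math


def binomial_lower_tail(m, p, threshold):
    """Exact P(X <= threshold) for X ~ Binomial(m, p)."""
    upper = min(math.floor(threshold), m)
    return sum(math.comb(m, k) * p ** k * (1 - p) ** (m - k) for k in range(0, upper + 1))


def hoeffding_bound(m, p, fraction):
    """Hoeffding upper bound on P(X <= fraction * m) for X ~ Binomial(m, p), fraction < p."""
    return math.exp(-2 * m * (p - fraction) ** 2)
```

With $p = 1/2$ and a threshold of $m/4$, the bound is $\exp(-m/8)$, so the failure probability decays exponentially in the number of relay squarelets assigned to a source-destination pair.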

In each of the $L(n)$ levels of the hierarchy there are at most $n^2$ source-destination pairs, and hence by
the union bound with probability at least

1 - L{n)n^ exp ( - K2-^(")7(n)) , 
for every source-destination pair on every level of the hierarchy at least one quarter of its relay squarelets
can support the desired rate. By the choices of $\gamma(n)$ and $L(n)$ in (11), this probability is at least

1 - L{n)n^ exp ( - K2-^^''^{n)) >l-n^exp(- j^2-^(")2'°s(«)/2i(") j 

> 1 - exp ( K2^°^^°^^"'^ - j^2^^°^''"^'(")~^°^''""'("A 

>l-exp(-2^('°s^''+'("))) 
>l-o(l) 

as $n \to \infty$, and for some constant $K$. This proves that the same order rate as in the fast fading case can
be achieved with high probability for all levels $0 \le \ell < L(n)$.

It remains to argue that the same holds for level $\ell = L(n)$. Note that since we assume phase fading
only, the received signal power is only a function of distance and not of the fading realization. Since at
level $L(n)$ we use simple TDMA, this implies that we can always achieve the same rate at level $L(n)$ as
in the fast fading case.

Hence with probability $1 - o(1)$ as $n \to \infty$, we achieve the same order rate at each level $0 \le \ell \le L(n)$
as for fast fading, proving Theorem 1 for the slow fading case.

VIII. Proof of Theorem 2

Here, we provide a generalization and sharpening of the converse in [8]. Most of the arguments follow 
[8, Theorem 5.2]. We start by proving a lemma upper bounding the MIMO capacity. 

Consider two subsets $S_1, S_2 \subset V(n)$ such that $S_1 \cap S_2 = \emptyset$. Assume we allow the nodes within $S_1$ and
$S_2$ to cooperate without any restriction. The maximum achievable sum rate between the nodes in $S_1$ and
$S_2$ is given by the MIMO capacity $C(S_1, S_2)$ between them. The next lemma upper bounds $C(S_1, S_2)$ in
terms of the node distances between the two sets and the normalized channel gains



Lemma 12. Under either fast or slow fading, for every $\alpha > 2$, $S_1, S_2 \subset V(n)$ with $S_1 \cap S_2 = \emptyset$, we have

0(81,82) <a( max U,rmixJ2\hu,v\']) 5Z 5Z ^n^- 
Proof. Let 

be the matrix of (normalized) channel gains between the nodes in $S_1$ and $S_2$. Consider first fast fading.
Under this assumption, we have

0(81,82)= max e( logdet (I + H^Q(H)H)]. 

QiH)>0: \ ^ 'J 

Define

$$P_{S_1,S_2} \triangleq \sum_{u \in S_1} \sum_{v \in S_2} r_{u,v}^{-\alpha}$$

as the total received power in $S_2$ from $S_1$, and set

$$P_{u,S_2} \triangleq P_{\{u\},S_2}$$

with slight abuse of notation. Then

C(Si,S2)= max e( log det (l + H^Q(H)H)] 

QiH)>0: V / 

E(g^,^)<Pn,S2VMG5i 

< max Eilogdet (I + Wq(H)H)]. (31) 

Q(H)>0: \ ^ ' ) 



E{trQ(ff))<Psi.S2 

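Define the event

$$B \triangleq \{\|H\|^2 > b\}$$

for some $b$ and where $\|H\|$ denotes the largest singular value of $H$. In words, $B$ is the event that the
channel gains between $S_1$ and $S_2$ are "good". We argue that, for appropriately chosen $b$, the event $B$ has
probability zero (i.e., the channel cannot be too "good"). By Markov's inequality

$$\mathbb{P}(B) \le b^{-m}\,\mathbb{E}\big(\|H\|^{2m}\big), \qquad (32)$$

for any $m$. We continue by upper bounding $\mathbb{E}(\|H\|^{2m})$. We have

$$\|H\|^{2k} \le \operatorname{tr}\big((HH^{\dagger})^{k}\big)$$

for any $k$, and hence

$$\mathbb{E}\big(\|H\|^{2m}\big) \le \mathbb{E}\Big(\big(\operatorname{tr}\big((HH^{\dagger})^{k}\big)\big)^{m/k}\Big). \qquad (33)$$

Now, for any $k \ge m$, we have by Jensen's inequality

$$\mathbb{E}\Big(\big(\operatorname{tr}\big((HH^{\dagger})^{k}\big)\big)^{m/k}\Big) \le \Big(\mathbb{E}\operatorname{tr}\big((HH^{\dagger})^{k}\big)\Big)^{m/k}. \qquad (34)$$

Combining (32), (33), and (34) yields

$$\mathbb{P}(B) \le b^{-m}\Big(\mathbb{E}\operatorname{tr}\big((HH^{\dagger})^{k}\big)\Big)^{m/k} \qquad (35)$$

for any $k \ge m$.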

Now, the arguments in [8, Lemma 5.3] show that 



E(tr((if if 1")^')) < tkul max I 1, max J^ |/i„,„p I J 



ueSi 
where $t_k$ is the $k$-th Catalan number. Combining with (35), this yields



b~Hl^''n^/'' (max h,maxJ2\Kv\^})] 



nGSi 
,1/fc 



Taking the limit as $k \to \infty$ and using that $t_k^{1/k} \to 4$ yields

r^4('max|l,max^|/i„,„p|] ] 

^ «G5i ^ 

Assume 



6 > 4( max <i l,max N^ |/i„,;|^ n, (36) 

ueSi 



then taking the limit as $m \to \infty$ shows that

$$\mathbb{P}(B) = 0.$$
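
The limit $t_k^{1/k} \to 4$ used above can be checked numerically. The sketch below assumes the standard indexing $t_k = \binom{2k}{k}/(k+1)$ for the Catalan numbers (the cited lemma may index them slightly differently, which does not affect the limit); the function names are hypothetical.

```python
import math


def catalan(k):
    """k-th Catalan number, t_k = C(2k, k) / (k + 1), computed exactly."""
    return math.comb(2 * k, k) // (k + 1)


def catalan_kth_root(k):
    """Return t_k ** (1 / k), computed via logarithms to avoid float overflow for large k."""
    return math.exp(math.log(catalan(k)) / k)
```

Since $t_k$ grows like $4^k / (k^{3/2}\sqrt{\pi})$, the $k$-th root approaches 4 from below, which is why the factor 4 appears in the threshold (36) on $b$.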
Using this, we can upper bound (31) as

C(Si,S2)< max ¥.(ti(H^Q{H)H 

Q(H)>0: \ \ 



Q{H)>0: 

EitTQ[H))<Ps,,s2 



max E(lB-tT(H^Q(H)H 

}{H)>0: \ y -*y ^ 

iH))<Ps„S2 

< max E(lB4H\\hTQ(H)] 

Q(H)>0: \^ M M /y 



Q{H)>0: 

E(trQ(ff))<Psj,s2 



Q{H)>0: 

E(trQ(ff))<Ps^,S2 

Since this is true for all $b$ satisfying (36), we obtain the lemma for the fast fading case.
Under slow fading

C{Si,S2)= max \ogdet {l + H^QH), 

qu,u<P VueSi 

and the lemma can be obtained by the same steps. $\square$

We now proceed to the proof of Theorem 2. Consider a vertical cut dividing the network into two
parts. By the minimum-separation requirement, an area of size $o(n)$ can contain at most $o(n)$ nodes, and
hence we can find a cut such that each part is of size $\Theta(n)$ and contains $\Theta(n)$ nodes. Call the left part of
the cut $S$. Since there are $\Theta(n)$ nodes in $S$ and in $S^c$, there are $\Omega(n)$ sources in $S$ with their destination
in $S^c$ with probability $1 - o(1)$. For technical reasons we add a node inside each square in $V(n)$ of the
form $[id, (i+1)d] \times [jd, (j+1)d]$ for some $i, j \in \mathbb{N}$, where $d = \sqrt{2\log(n)}$. These additional nodes have
no traffic demands of their own, and simply help with the transmission. This can clearly only increase
achievable rates. Moreover, this increases the number of nodes in $V$ by less than a factor 2. We now show
that

CiS,S') = 0{log%ny-"/'), (37) 

and hence by the cut-set bound, and since there are $\Omega(n)$ sources in $S$ with their destination in $S^c$, we
have

p*(n)=0(log^Hni-"/2). 

We prove (37) using Lemma 12. To this end, we need to upper bound



max 

ues 



X^l^«,^ 



The proof of [8, Lemma 5.3] shows that if

1) there are fewer than $\log(n)$ nodes inside $[i, i+1] \times [j, j+1]$ for any $i, j \in \{0, \ldots, \sqrt{n} - 1\}$,

2) there is at least one node inside $[id, (i+1)d] \times [jd, (j+1)d]$ for any $i, j$, where $d = \sqrt{2\log n}$,

then



'-J2\Kv\^<Klog\n), (38) 

and for a E (2, 3] 



max 



$^$^r-e</nog3Hn2-"/2, (39) 



ugs ^gs= 



for constants $K$, $\tilde{K}$. For arbitrary node placement with minimum separation, the first requirement is
satisfied for $n$ large enough, since only a constant number of nodes can be contained in each area of
constant size. By our addition of nodes into $V(n)$ described above, the second condition is also satisfied.
Using Lemma 12 with (38) and (39) yields (37), concluding the proof of Theorem 2.
IX. Proof of Theorem 3

Consider a node placement with $n/2$ nodes located uniformly on $[0, \sqrt{n}/4] \times [0, \sqrt{n}]$ and $n/2$ nodes
located on $[\sqrt{n}/2, \sqrt{n}] \times [0, \sqrt{n}]$ with minimum separation $r_{\min} = 1/2$. A random traffic matrix $\lambda(n)$ is
such that at least $n/4$ communication pairs have their sources in the left cluster and destinations in the
right cluster with probability $1 - o(1)$. Assume we are dealing with such a $\lambda(n)$ in the following.

In this setup, with multi-hop at least one hop has to cross the gap between the left and the right cluster. 
Thus, even without any interference from other nodes, we can obtain at most 



Moreover, considering a cut between the two clusters (say, $S$ and $S^c$), and applying Lemma 12, yields
p*{n) < 16n-^rmax|l,max^|/i„,,|2|') ^^r-;. (40) 

Now note that for any $u \in S$, $v \in S^c$, we have



1 ^ ^ 



Hence 



and 



fp—a 



ues ues ^i&S'= u,v 



3a 



Combining this with (40) yields
for all a > 2. 



E E -ur. < r-'n'-^'. 



X. Proof of Theorem 4

We construct a cooperative multi-hop communication scheme and lower bound the per-node rate
it achieves. We use the hierarchical relaying scheme as a building block. Assume the node
placement $V(n)$ is $\mu$-regular at resolution $d(n)$ for all $n \geq 1$. We show that this implies that we can
achieve a per-node rate of at least $d^{3-\alpha}(n) n^{-1/2 - o(1)}$ as $n \to \infty$. Taking the smallest such $d(n)$ then
yields the result.

We consider three cases for the value of $d(n)$ (namely, $d(n) = \Omega(\sqrt{n})$, $d(n) \geq n^{\log^{\delta - 1/2}(n)}$, and $d(n) < n^{\log^{\delta - 1/2}(n)}$).
First, if $d(n) = \Omega(\sqrt{n})$ as $n \to \infty$ then the result follows directly from Theorem 1. Considering a
subsequence if necessary, we can therefore assume without loss of generality that $d(n) = o(\sqrt{n})$ in the
following.
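
The case distinction can be made concrete with a small numerical sketch. It is an illustration only: it reads the threshold separating the second and third case as $n^{\log^{\delta - 1/2}(n)}$, uses the finite-$n$ test $d \geq \sqrt{n}$ as a stand-in for the asymptotic condition $d(n) = \Omega(\sqrt{n})$, and the function names `threshold` and `regime` are hypothetical.

```python
import math


def threshold(n, delta):
    """Boundary n ** (log(n) ** (delta - 1/2)) between the second and third case."""
    return n ** (math.log(n) ** (delta - 0.5))


def regime(n, d, delta):
    """Name the scheme used in the proof for resolution d at network size n."""
    if d >= math.sqrt(n):
        # Stand-in for d(n) = Omega(sqrt(n)): Theorem 1 applies directly.
        return "hierarchical relaying"
    if d >= threshold(n, delta):
        # Cooperation at local scale d(n), multi-hop at global scale sqrt(n).
        return "cooperative multi-hop"
    # Plain multi-hop between adjacent squares of sidelength d(n).
    return "simple multi-hop"


n, delta = 10**6, 0.1
print(round(threshold(n, delta), 1))
for d in (50, 200, 1000):
    print(d, regime(n, d, delta))
```

Note that the threshold falls below $\sqrt{n}$ only once $\log n > 2^{1/(1/2 - \delta)}$, so for $\delta$ close to $1/2$ the middle case is empty at moderate network sizes.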

Second, consider $d(n)$ satisfying

$$d(n) \geq n^{\log^{\delta - 1/2}(n)}. \tag{41}$$

Divide $A(n)$ into squares of sidelength $d(n)$. Since $d(n) = o(\sqrt{n})$, the number of such squares grows
unbounded as $n \to \infty$. We now show that we can use multi-hop communication with a hop length of $d(n)$,
where each hop is implemented by squares cooperatively sending information to a neighboring square.
In other words, we perform cooperative communication at local scale $d(n)$ and multi-hop communication
at global scale $\sqrt{n}$.

Since $V(n)$ is $\mu$-regular at resolution $d(n)$, each such square contains at least $\mu d^2(n)$ nodes. Pick the
top-left-most square and construct the square of sidelength $2d(n)$ consisting of it together with its 3
neighbors. Continue in the same fashion, partitioning all of $A(n)$ into squares of sidelength $2d(n)$. Note
that each such bigger square contains at least $4\mu d^2(n)$ nodes by the definition of $d(n)$. We assume this
worst case in the following. Partition $A(n)$ into 4 subsets of those bigger squares such that within each
such subset each square is at distance at least $2d(n)$ from any other square (see Figure 6). We time share
between those 4 subsets. Consider in the following one such subset. For every bigger square, we construct
two permutation traffic matrices $\lambda_1(4\mu d^2(n))$ and $\lambda_2(4\mu d^2(n))$. In $\lambda_1$ the nodes in the top two squares
have as destinations the nodes in the bottom two squares and the nodes in the bottom two squares have as
destinations the nodes in the top two squares (see Figure 6). Similarly, $\lambda_2$ contains communication pairs
between left and right squares. We time share between $\lambda_1$ and $\lambda_2$.
Fig. 6. Sketch of the construction of the cooperative multi-hop scheme in the proof of Theorem 4. The dashed squares have sidelength
$d(n)$. The gray area is one of the 4 subsets of bigger squares that communicate simultaneously. The arrows indicate the traffic matrix $\lambda_1$.
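
One way to realize the partition into 4 subsets is a checkerboard-style labeling of the bigger squares by the parity of their row and column index. The sketch below is an assumption about how the subsets can be chosen (the text only requires the separation property); `subset_index` and `min_gap_within_subsets` are hypothetical names. It verifies by brute force that two distinct bigger squares in the same subset are at distance at least $2d(n)$.

```python
import math


def subset_index(row, col):
    """Label a bigger square by the parity of its row and column: 0, 1, 2 or 3."""
    return 2 * (row % 2) + (col % 2)


def min_gap_within_subsets(num_big, d):
    """Smallest distance between two distinct bigger squares sharing a label.

    The bigger squares have sidelength 2d and tile a num_big x num_big array.
    """
    side = 2 * d
    squares = [(r, c) for r in range(num_big) for c in range(num_big)]
    best = math.inf
    for a in squares:
        for b in squares:
            if a < b and subset_index(*a) == subset_index(*b):
                # Gap between two axis-aligned squares along each axis.
                gap_r = max(0.0, abs(a[0] - b[0]) * side - side)
                gap_c = max(0.0, abs(a[1] - b[1]) * side - side)
                best = min(best, math.hypot(gap_r, gap_c))
    return best


d = 1.5
print(min_gap_within_subsets(6, d), 2 * d)
```

With this labeling, squares in the same subset differ by an even number of rows or columns, so the closest pair is separated by exactly one bigger square of sidelength $2d(n)$.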

Communication according to $\lambda_i$ within bigger squares in the same subset occurs simultaneously. We
are going to use hierarchical relaying within each bigger square. This is possible since each such square
contains at least $4\mu d^2(n)$ nodes. We have to show that the additional interference from bigger squares in
the same subset is such that Theorem 1 still applies. In particular, we need to show that the interference
has bounded power, say $K$. Using the same arguments as in the proof of Theorem 1 in Section VII yields
that this is indeed the case (the interference from other bigger squares here behaves the same way as
the interference due to spatial re-use from other active relay squarelets there). With this, we are now
dealing with a hierarchical relaying scheme with area $4d^2(n)$, $4\mu d^2(n)$ nodes, and additive noise with
power $1 + K$. Both the lower number of nodes and the higher noise power will decrease the achievable
per-node rate by only some constant factor, and hence Theorem 1 shows that under fast fading we can
achieve a per-node rate of at least



$$b_1(d^2(n)) \, (d^2(n))^{1 - \alpha/2} \geq \tilde{b}_1(n) \, d^{2-\alpha}(n)$$



as $n \to \infty$, where



$$\tilde{b}_1(n) \geq n^{-O(\log^{\delta - 1/2}(n))}.$$
Moreover, the same rate is achievable under slow fading with probability $1 - b_2(d^2(n))$, where



The setup is the same for all bigger squares within each of the 4 subsets. 

We now "shift" the way we defined the bigger squares by $d(n)$ to the right and to the bottom. With this,
each new bigger square intersects with 4 bigger squares as defined before. We use the same communication
scheme within these new bigger squares and time share between the two ways of defining bigger squares.
Construct now a graph where each vertex corresponds to a square of sidelength $d(n)$ and where two
vertices are connected by an edge if they are adjacent in either the same old or new bigger square. This
graph is depicted in Figure 4 in Section V.
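
The claim that the old and the shifted bigger squares together connect every pair of side-adjacent small squares can be checked directly. The following sketch indexes the small squares of sidelength $d(n)$ by `(i, j)` on an `m` by `m` array and assumes, as a simplification not spelled out in the text, that shifted bigger squares clipped by the boundary of the region are still used. The function name `cooperative_edges` is hypothetical.

```python
def cooperative_edges(m):
    """Edges between side-adjacent small squares that share a bigger square.

    Old bigger squares group indices as i // 2; bigger squares shifted by one
    small square group them as (i + 1) // 2.
    """
    edges = set()
    for i in range(m):
        for j in range(m):
            for di, dj in ((1, 0), (0, 1)):
                k, l = i + di, j + dj
                if k >= m or l >= m:
                    continue
                same_old = (i // 2, j // 2) == (k // 2, l // 2)
                same_new = ((i + 1) // 2, (j + 1) // 2) == ((k + 1) // 2, (l + 1) // 2)
                if same_old or same_new:
                    edges.add(((i, j), (k, l)))
    return edges


m = 6
# An m x m grid graph has 2 * m * (m - 1) edges.
print(len(cooperative_edges(m)), 2 * m * (m - 1))
```

Every horizontal or vertical neighbor pair lies in the same old bigger square when the smaller index is even and in the same shifted one when it is odd, which is why the resulting graph is the full grid.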

With the above construction, we can communicate along each edge of this graph simultaneously at a 
per-node rate of 

$$\frac{1}{16} \tilde{b}_1(n) \, d^{2-\alpha}(n)$$

in the fast fading case. In the slow fading case, this statement holds with probability at least 

> 1 - exp ('a"2^°^^°^("^ - 2^i°s''"^'('^("))' 



for constants $K'$, $\tilde{K}$. By assumption (41),

1/2+5 



logV2+^(rfH)>(-^logV2+^ 
\2 + a 



n 

+ 



and hence 



^ ' 'J2/ 



l-^^^h,{d\n))>l-o{l) 



as $n \to \infty$, showing that with high probability we achieve the same order rate under slow fading as under
fast fading.

The communication graph constructed forms a grid with $n/d^2(n)$ nodes. Using that each bigger square
can contain at most $K_1 d^2(n)$ nodes by the minimum-separation requirement, standard arguments for
routing over grid graphs (see [16]) show that in the fast fading case we can achieve a per-node rate of



jn 
where 



h{n)=n^<'°^'-'''^-~^)^ 



Moreover, the same statement holds in the slow fading case with probability 1 — o(l). 
Finally, consider $d(n)$ such that

$$d(n) < n^{\log^{\delta - 1/2}(n)}. \tag{42}$$



Construct the same communication graph as before, but this time we use simple multi-hop communication
between adjacent squares of sidelength $d(n)$. By time sharing between the at most $K_1 d^2(n)$ nodes in each
square, and since we communicate over a distance of at most $3d(n)$, we achieve under either fast or slow
fading a per-node rate between the squares of at least

for some constant $K''$, and where we have used (42). Using the analysis of grid graphs as before, we can
achieve a per-node rate of at least



\ n 



for either the fast or slow fading case. 
XI. Proof of Theorem 5

Consider $V(n)$ with $n/2$ nodes located uniformly on $[0, (\sqrt{n} - d^*(n))/2] \times [0, \sqrt{n}]$ and $n/2$ nodes
located uniformly on $[\sqrt{n}/2, \sqrt{n}] \times [0, \sqrt{n}]$ such that $r_{\min} = 1/2$. This node placement is 1/2-regular at
resolution $d^*(n)$. A random traffic matrix $\lambda(n)$ is such that $\Omega(n)$ communication pairs have their sources
in the left cluster and destinations in the right cluster with probability $1 - o(1)$. Assume we are dealing
with such a $\lambda(n)$ in the following.

Considering a cut between the two clusters and applying Lemma 12 (slightly adapting the arguments
in Section VIII) yields that

$$\rho^*(n) = O\big(\log^6(n) \, (d^*(n))^{3-\alpha} \, n^{-1/2}\big)$$

for $\alpha > 3$.
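
To see how this bound interpolates between the multi-hop and the cooperative regime, it helps to write $d^*(n) = n^{\beta}$ and track only the polynomial exponent of $n$, ignoring the logarithmic factor. The sketch below is a worked illustration under that parametrization (the parameter $\beta$ and the function name `rate_exponent` are not from the paper): $\beta = 0$ gives the multi-hop exponent $-1/2$, and $\beta = 1/2$ gives the cooperative exponent $1 - \alpha/2$.

```python
from fractions import Fraction


def rate_exponent(alpha, beta):
    """Exponent e with (d*)^(3 - alpha) * n^(-1/2) = n^e when d*(n) = n^beta."""
    return Fraction(beta) * (3 - Fraction(alpha)) - Fraction(1, 2)


alpha = 4
for beta in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
    print(f"beta = {beta}: per-node rate scales like n^({rate_exponent(alpha, beta)})")
```

For $\alpha > 3$ the exponent decreases linearly in $\beta$, so less regular placements (larger $d^*(n)$) yield a strictly smaller achievable per-node rate.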

XII. Discussion

We briefly discuss several aspects of the proposed hierarchical relaying scheme. Section XII-A comments
on the full CSI assumption and Section XII-B on the use of bursty communication. Sections XII-C and
XII-D outline how the results obtained here can be extended to the case of dense networks and networks
without minimum separation between nodes. Section XII-E compares our hierarchical relaying scheme to
the hierarchical cooperation scheme presented in [8].

A. Full CSI Assumption 

Throughout our analysis, we have made a full CSI assumption. In other words, we assumed that the
phase shifts $\{\theta_{u,v}[t]\}_{u,v}$ are available at time $t$ at all nodes in the network. As this assumption is quite
strong, it is worth commenting on. First, we make the full CSI assumption in all the converse results in
this paper. This implies that all the converses also hold under weaker assumptions on the CSI, and hence
are valid as well under a wide variety of more realistic assumptions on the availability of side information.
Second, all achievability results can be shown to hold under weaker assumptions on the availability of
CSI. In fact, in all cases, a 2-bit quantization of the channel state $\{\theta_{u,v}[t]\}_{u,v}$ available at all nodes at time
$t$ is sufficient to obtain the same scaling behavior. This follows by an argument similar to the one used
in the analysis of the BC phase in Section VI-C, where it is shown that beamforming using a quantized
channel state results only in a constant factor rate loss.
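
The intuition behind the constant-factor loss can be illustrated numerically. This is a simplified sketch and not the argument of Section VI-C: it assumes unit-amplitude channels, rounds each phase to the nearest of four levels, and compares the coherently combined amplitude with the ideal value $N$. Since the residual phase error is at most $\pi/4$, each term contributes at least $\cos(\pi/4)$ in the desired direction, so the received power drops by at most a factor of 2. The function names are hypothetical.

```python
import cmath
import math
import random


def quantize_phase(theta, bits=2):
    """Round a phase to the nearest of 2**bits uniformly spaced levels."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    return (round(theta / step) % levels) * step


def beamforming_amplitude(phases, bits=2):
    """Amplitude of the sum when each transmitter compensates the quantized phase."""
    total = sum(cmath.exp(1j * (th - quantize_phase(th, bits))) for th in phases)
    return abs(total)


random.seed(0)
N = 1000
phases = [random.uniform(0, 2 * math.pi) for _ in range(N)]
amplitude = beamforming_amplitude(phases)
print(amplitude / N, math.cos(math.pi / 4))
```

The printed ratio is guaranteed to lie between $\cos(\pi/4)$ and 1, independent of $N$, which is the kind of bounded loss that leaves the scaling behavior unchanged.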

B. Burstiness of Hierarchical Relaying Scheme 

The hierarchical relaying scheme presented here is bursty in the sense that nodes communicate at high 
power during a small fraction of time. This leads to high peak-to-average power ratio, which is undesirable 
in practice. We chose burstiness in the time domain to simplify the exposition. The same bursty behavior 
could be achieved in a more practical manner by using CDMA with several orthogonal signatures or by 
using OFDM with many sub-carriers. Each approach leads to many parallel channels out of which only 
few are used with higher power. This avoids the issue of high peak-to-average power ratio in the time 
domain. 

C. Dense Networks 

Throughout this paper, we have only considered extended networks, i.e., $n$ nodes placed on a square
region of area $n$ with a minimum separation of $r_{u,v} \geq r_{\min}$. The results can, however, be recast for dense
networks, where $n$ nodes are arbitrarily placed on a square region of unit area with a minimum separation
of $r_{u,v} \geq r_{\min}/\sqrt{n}$. It suffices to notice that by rescaling power by a factor $n^{-\alpha/2}$ a dense network can
essentially be transformed into an extended network with path-loss exponent $\alpha$ (see also [8]). Hence the
same result for dense networks can be obtained from the result for extended networks by considering the
limit $\alpha \to 2$. Applying this to Theorem 1 yields a linear per-node rate scaling of the hierarchical relaying
scheme.
D. Minimum-Separation Requirement

The minimum-separation requirement $r_{\min} \in (0, 1)$ on the node placement is sufficient but not necessary
for Theorem 1 to hold. A weaker sufficient condition is that a constant fraction of squarelets are dense, as
shown in Lemma 6 to be a consequence of the minimum-separation requirement. It is straightforward to
show that this weaker condition is satisfied with high probability for nodes placed uniformly at random
on $[0, \sqrt{n}]^2$. This yields a different proof of Theorem 5.1 in [8].

E. Comparison with [8] 

Both the hierarchical relaying scheme presented here and the hierarchical scheme presented in [8]
use virtual multiple-antenna communication and a hierarchical architecture to achieve
essentially global cooperation in the network. The schemes differ, however, in several key aspects, which
we point out here.

First, we note that we obtain a slightly better scaling law. Namely

$$b_1(n) n^{1-\alpha/2} \leq \rho^*(n) \leq b_2(n) n^{1-\alpha/2}$$

with

$$b_1(n) \geq n^{-O(\log^{\delta - 1/2}(n))},$$

$$b_2(n) = O(\log^6(n)),$$

for any $\delta \in (0, 1/2)$ obtained here, compared to

$$\tilde{b}_1(n) n^{1-\alpha/2} \leq \rho^*(n) \leq \tilde{b}_2(n) n^{1-\alpha/2}$$

with

$$\tilde{b}_1(n) = \Omega(n^{-\varepsilon}),$$

$$\tilde{b}_2(n) = O(n^{\varepsilon}),$$

for any $\varepsilon > 0$ in [8]. For the lower bound (i.e., achievability), this is because the hierarchy here is not
of fixed depth $L$ as in [8], but rather of depth $L(n) = \log^{1/2 - \delta}(n)$ (for some constant $\delta \in (0, 1/2)$), i.e.,
changing with $n$. For the upper bound (i.e., converse), this is due to a sharpening of the arguments in [8].
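
The difference between the two lower-order factors is asymptotic and only becomes visible for very large $n$. The sketch below compares them in the log domain. It assumes the factor obtained here has the form $n^{-\log^{\delta - 1/2}(n)}$ with the constant inside the $O(\cdot)$ set to 1, and it uses hypothetical function names. In this simplified setting, the factor $n^{-\varepsilon}$ of the fixed-depth scheme is the smaller one only once $\log n$ exceeds $\varepsilon^{-1/(1/2 - \delta)}$.

```python
def log_factor_growing_depth(log_n, delta):
    """log of n^(-log^(delta - 1/2)(n)), i.e. -(log n)^(1/2 + delta)."""
    return -(log_n ** (0.5 + delta))


def log_factor_fixed_depth(log_n, eps):
    """log of n^(-eps), i.e. -eps * log n."""
    return -eps * log_n


def crossover_log_n(delta, eps):
    """Value of log n at which both factors coincide."""
    return (1.0 / eps) ** (1.0 / (0.5 - delta))


delta, eps = 0.25, 0.1
print(crossover_log_n(delta, eps))
for log_n in (1e2, 1e4, 1e6):
    print(log_n, log_factor_growing_depth(log_n, delta), log_factor_fixed_depth(log_n, eps))
```

For any fixed $\varepsilon > 0$ the growing-depth factor eventually dominates, which is the sense in which the scaling law obtained here is slightly better.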

Second, note that the multi-user decoding at the relay squarelets during the MAC phase and the multi- 
user encoding during the BC phase are very simple in our setup. In fact, using matched filter receivers 
and transmit beamforming, we convert the multi-user encoding and decoding problems into several single- 
user decoding and encoding problems. This differs from the approach in [8], in which joint decoding of a 
number of users on the order of the network size is performed. Our results thus imply that these simpler 
transmitter and receiver structures provide the same scaling as the more complicated joint decoding in [8]. 
We note that the scheme proposed in [8] can be modified to also use matched filter receivers as suggested 
here. 

Third, and probably most important, the schemes differ in how they achieve the throughput gain from
using multiple antennas. In [8], the nodes are located almost regularly with high probability. This allowed
the use of a scheme in which a source squarelet directly communicates with a destination squarelet. In
other words, the multiple-antenna gain comes from setting up a virtual MIMO channel between the source
and the destination. In our setup, the arbitrary location of nodes prevents such an approach. Instead, we use
the fact that at least some fixed fraction of squarelets is almost regular (we called them dense squarelets). Source-
destination pairs relay their traffic over such a dense squarelet. In other words, the multiple-antenna gain
comes from setting up a virtual multiple-antenna MAC and BC. Thus, the hierarchical relaying scheme
presented here shows that considerably less structure on the node locations than assumed in [8] suffices to
achieve a multiple-antenna gain essentially on the order of the network size. Note also that the additional
degree of freedom offered by the choice of relay squarelet for a given source-destination pair makes it
possible to extend the result to hold also for slow fading channels.
XIII. Conclusions

We considered the problem of the scaling of achievable rates in arbitrary extended wireless networks. We 
generalized the hierarchical cooperative communication scheme presented in [8] for a fast fading channel 
model and with random node placements. We proposed a different hierarchical cooperative communication 
scheme, which also works for arbitrary node placement (with a minimum-separation requirement) and for 
either fast or slow fading. 

For small path-loss exponent $\alpha \in (2, 3]$, we showed that our scheme is order optimal and achieves the
same rate irrespective of the node placement. In particular, this rate is equal to the one achievable under 
random node placement. In other words, the regularity of the node placement has no impact on achievable 
rates for small path-loss exponent. 

The situation is, however, quite different for large path-loss exponent $\alpha > 3$. We argued that in
this regime the regularity of the node placement directly impacts the scaling of achievable rates. We
then presented a cooperative communication scheme that smoothly "interpolates" between multi-hop and
hierarchical cooperative communication depending on the regularity of the node placement. We showed
that this scheme is order optimal for all $\alpha > 3$ under adversarial node placement with regularity constraint.
This contrasts with the situation for more regular networks (like the ones obtained with high probability
through random node placement), in which multi-hop communication is order optimal for all $\alpha > 3$.
Thus, for less regular networks, the use of more complicated cooperative communication schemes can be
necessary for optimal operation of the network.